CN116091778B - Semantic segmentation processing method, device and equipment for data - Google Patents

Semantic segmentation processing method, device and equipment for data

Info

Publication number
CN116091778B
CN116091778B (application CN202310308128.3A)
Authority
CN
China
Prior art keywords
results
data
processing
segmentation
coupling
Prior art date
Legal status
Active
Application number
CN202310308128.3A
Other languages
Chinese (zh)
Other versions
CN116091778A (en)
Inventor
刘丰华
侯涛
魏建权
Current Assignee
Beijing Wuyi Vision Digital Twin Technology Co ltd
Original Assignee
Beijing Wuyi Vision Digital Twin Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Wuyi Vision Digital Twin Technology Co ltd
Priority to CN202310308128.3A
Publication of CN116091778A
Application granted
Publication of CN116091778B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The disclosure relates to a semantic segmentation processing method, device and equipment for data. The semantic segmentation processing method comprises the following steps: acquiring data to be processed of a target scene; dividing the data to be processed to obtain a plurality of data blocks to be semantically segmented; inputting the plurality of data blocks into a context-coupled semantic segmentation model for segmentation processing to obtain a plurality of segmentation results, wherein the context-coupled semantic segmentation model comprises a context sampling interpolation branch and a direct sampling branch with different scale features; and splicing the plurality of segmentation results to obtain a final semantic segmentation result of the target scene. The disclosed scheme is applicable to data segmentation of large-scale natural scenes.

Description

Semantic segmentation processing method, device and equipment for data
Technical Field
The disclosure belongs to the technical field of computer information processing, and particularly relates to a semantic segmentation processing method, device and equipment for data.
Background
With the development of computer technology and digitalization, effective segmentation of large-scene point cloud data has become an important research direction. Point cloud segmentation networks based on deep learning have attracted wide attention owing to their good generalization capability and robust segmentation performance; however, such methods are often difficult to adapt to large-scale natural scenes.
Disclosure of Invention
The embodiments of the disclosure aim to provide a semantic segmentation processing method, device and equipment for data, which realize data segmentation adapted to large-scale natural scenes.
In a first aspect, an embodiment of the present disclosure provides a method for processing semantic segmentation of data, where the method includes:
obtaining to-be-processed data of a target scene, wherein the to-be-processed data is laser point information obtained by scanning a surface of a target object through a laser beam according to a preset track and reflecting the laser beam; the information carried by the laser point comprises: azimuth and distance;
dividing the data to be processed to obtain a plurality of data blocks to be semantically divided;
inputting a plurality of data blocks into a context coupling semantic segmentation model to carry out segmentation processing to obtain a plurality of segmentation results, wherein the context coupling semantic segmentation model comprises context sampling interpolation branches and direct sampling branches with different scale characteristics;
and performing splicing treatment on the multiple segmentation results to obtain a final semantic segmentation result of the target scene.
Optionally, acquiring the data to be processed of the target scene includes:
reading vertexes of an OSGB (oblique scan data) grid of the target scene;
extracting vertex coordinates of triangular faces in the grid to form sparse data;
performing proportional contraction on the vertexes of the triangular faces in the sparse data to obtain coordinates of new vertexes;
and performing two-dimensional texture coordinate mapping on the coordinates of the new vertexes on the triangular faces to obtain the data to be processed.
Optionally, the partitioning processing is performed on the data to be processed to obtain a plurality of data blocks to be semantically partitioned, including:
semantic annotation is carried out on the data to be processed to obtain the annotated data to be processed;
dividing the marked data to be processed by using a preset standard area square block to obtain a plurality of data blocks to be semantically segmented.
Optionally, inputting the plurality of data blocks into the context-coupled semantic segmentation model for segmentation processing to obtain a plurality of segmentation results, including:
reconstructing the surface features of the plurality of data blocks to obtain local spatial features;
inputting local spatial features into a plurality of first encoders in a context sampling interpolation branch of a context coupling semantic segmentation model for processing to obtain a plurality of first encoding processing results;
inputting the local spatial features into a plurality of second encoders in a direct sampling branch of the context coupling semantic segmentation model for processing to obtain a plurality of second encoding processing results; the first encoder in the context sample interpolation branch is different from the second encoder in the direct sampling branch in coding scale characteristics;
And obtaining a plurality of segmentation results according to the plurality of first coding processing results and the plurality of second coding processing results.
Optionally, obtaining a plurality of segmentation results according to the plurality of first coding processing results and the plurality of second coding processing results includes:
respectively carrying out multi-scale coupling on n second coding processing results in the plurality of second coding processing results to obtain n coupling results;
respectively cascading the n coupling results with n first coding processing results to obtain n cascading results;
inputting n cascade results and encoding processing results which are not subjected to multi-scale coupling into a plurality of decoders respectively to decode and recover density, so as to obtain a plurality of segmentation results; n is a positive integer.
Optionally, performing multi-scale coupling on n second encoding processing results among the plurality of second encoding processing results to obtain n coupling results includes:
performing multi-scale coupling on the n second encoding processing results respectively, according to a coupling formula (given as an image in the original publication), to obtain the n coupling results;
wherein {F1', F2', …, Fn'} is the sequence formed by the n coupling results, a nonlinear transformation function is applied in the coupling, F1 is the first of the n second encoding processing results, F2 is the second of the n second encoding processing results, and Fn is the n-th second encoding processing result.
Optionally, performing a stitching process on the multiple segmentation results to obtain a final semantic segmentation result of the target scene, where the stitching process includes:
and performing OR operation on the overlapped parts of the plurality of segmentation results to obtain a final semantic segmentation result of the target scene.
In a second aspect, an embodiment of the present disclosure provides a semantic segmentation processing apparatus for data, including:
the acquisition module is used for acquiring data to be processed of a target scene, wherein the data to be processed is laser point information obtained by scanning a surface of a target object through a laser beam according to a preset track and reflecting the laser beam; the information carried by the laser point comprises: azimuth and distance;
the first processing module is used for carrying out segmentation processing on the data to be processed to obtain a plurality of data blocks to be semantically segmented;
the second processing module is used for inputting the plurality of data blocks into the context coupling semantic segmentation model to perform point cloud segmentation processing to obtain a plurality of segmentation results, wherein the context coupling semantic segmentation model comprises context sampling interpolation branches and direct sampling branches with different scale characteristics;
and the third processing module is used for performing splicing processing on the plurality of segmentation results to obtain a final semantic segmentation result of the target scene.
Optionally, the acquiring module includes:
the first acquisition sub-module is used for reading vertexes of the OSGB (oblique scanning data) grid of the target scene;
the second acquisition submodule is used for extracting vertex coordinates of triangular faces in the grids to form sparse data;
the third acquisition submodule is used for carrying out proportional contraction on the vertexes of the triangular surface in the sparse data to obtain the coordinates of the new vertexes;
and the fourth acquisition sub-module is used for carrying out two-dimensional texture coordinate mapping on the coordinates of the new vertexes on the triangular surface to obtain data to be processed.
Optionally, the first processing module includes:
the first processing sub-module is used for carrying out semantic annotation on the data to be processed to obtain the annotated data to be processed;
the second processing sub-module is used for dividing the marked data to be processed by using a preset standard area square block to obtain a plurality of data blocks to be subjected to semantic segmentation.
Optionally, the second processing module includes:
the third processing sub-module is used for reconstructing the surface characteristics of the plurality of data blocks to obtain local space characteristics;
a fourth processing sub-module, configured to input local spatial features into a plurality of first encoders in a context sampling interpolation branch of the context coupling semantic segmentation model for processing, so as to obtain a plurality of first encoding processing results;
A fifth processing sub-module, configured to input the local spatial feature into a plurality of second encoders in a direct sampling branch of the context-coupled semantic segmentation model for processing, so as to obtain a plurality of second encoding processing results; the first encoder in the context sample interpolation branch is different from the second encoder in the direct sampling branch in coding scale characteristics;
and the sixth processing submodule is used for obtaining a plurality of segmentation results according to the plurality of first coding processing results and the plurality of second coding processing results.
Optionally, the sixth processing submodule includes:
the first processing subunit is used for respectively carrying out multi-scale coupling on n second coding processing results in the plurality of second coding processing results to obtain n coupling results;
the second processing subunit is used for respectively cascading the n coupling results with the n first coding processing results to obtain n cascading results;
the third processing subunit is used for inputting the n cascade results and the encoding processing result which is not subjected to multi-scale coupling into a plurality of decoders respectively to decode and recover the density, so as to obtain a plurality of segmentation results; n is a positive integer.
Optionally, the first processing subunit performs multi-scale coupling on the n second encoding processing results respectively, according to a coupling formula (given as an image in the original publication), to obtain n coupling results;
wherein {F1', F2', …, Fn'} is the sequence formed by the n coupling results, a nonlinear transformation function is applied in the coupling, F1 is the first of the n second encoding processing results, F2 is the second of the n second encoding processing results, and Fn is the n-th second encoding processing result.
Optionally, the third processing module includes:
and the seventh processing sub-module is used for performing OR operation on the overlapped parts of the plurality of segmentation results to obtain a final semantic segmentation result of the target scene.
In a third aspect, embodiments of the present disclosure provide a computing device comprising a processor, a memory, and a program or instruction stored on the memory and executable on the processor, the program or instruction implementing the steps of the method of semantic segmentation processing of data as in the first aspect when executed by the processor.
In a fourth aspect, embodiments of the present disclosure provide a readable storage medium having stored thereon a program or instructions which, when executed by a processor, implement the steps of a method for semantic segmentation processing of data as in the first aspect.
In the embodiment of the disclosure, different context representations are obtained through the context sampling interpolation branch and the direct sampling branch of the context-coupled semantic segmentation model, and data of different densities interact, so that a more accurate neighborhood description is obtained, making the method suitable for data segmentation of large-scale natural scenes.
Drawings
FIG. 1 is a flow chart of a method of semantic segmentation processing of data provided by an embodiment of the present disclosure;
fig. 2 is a schematic diagram of a framework of a context coupling point cloud semantic segmentation model in a semantic segmentation processing method of data provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a process flow of a multi-scale coupling module in a context-coupled semantic segmentation model provided by an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of data stitching provided by an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of an OR operation of multiple segmentation results provided by an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of a semantic segmentation processing device for data according to an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a computing device provided by an embodiment of the present disclosure;
fig. 8 is a schematic diagram of a hardware architecture of a computing device provided to implement an embodiment of the present disclosure.
Detailed Description
Technical solutions in the embodiments of the present disclosure will be clearly described below with reference to the drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are some embodiments of the present disclosure, but not all embodiments. All other embodiments obtained by one of ordinary skill in the art based on the embodiments in this disclosure are within the scope of the present disclosure.
The terms first, second and the like in the description and in the claims, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged, where appropriate, such that embodiments of the disclosure may be practiced in sequences other than those illustrated and described herein, and that the objects identified by "first," "second," etc. are generally of the same type and are not limited to the number of objects, e.g., the first object may be one or more. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/", generally means that the associated object is an "or" relationship.
The embodiments of the present disclosure relate generally to point cloud data processing of a target scene. Different context representations are obtained through the context sampling interpolation branch and the direct sampling branch of a context-coupled point cloud semantic segmentation model, where the context sampling interpolation branch can change the point cloud density (for example, increase it), and point cloud data of different densities interact, so that a more accurate neighborhood description is obtained. At the same time, the method is suitable for point cloud data segmentation of large-scale natural scenes, adapts well to the complexity of targets in natural scenes, and perceives objects of different scales well.
It should be noted that, since point cloud data carries spatial coordinates, the target scenes to which the point cloud data is applied include at least one of topographic mapping, elevation (facade) measurement of a target object, digital twinning, and unmanned driving. Elevation measurement based on point cloud data can overcome the limitations of traditional target elevation measurement; detailed elevation data can be acquired rapidly by elevation scanning, making the measurement more intuitive and efficient.
The method, the device, the system and the computing equipment for processing the semantic segmentation of the data provided by the embodiment of the disclosure are described in detail below through specific embodiments and application scenes thereof with reference to the accompanying drawings.
Fig. 1 is a flowchart of a method for processing semantic segmentation of data according to an embodiment of the present disclosure, referring to fig. 1, the method may include the following steps:
step 101, obtaining data to be processed of a target scene, wherein the data to be processed is laser point information obtained by scanning a surface of a target object through a laser beam according to a preset track and reflecting the laser beam; the information carried by the laser point comprises: azimuth and distance; the target scenarios herein include, but are not limited to: building facade measurement scenes, digital twin scenes, unmanned scenes, and the like.
When step 101 is specifically implemented, the method may include:
step 1011, reading the vertexes of the OSGB grid of the oblique scan data of the target scene;
step 1012, extracting vertex coordinates of triangular surfaces in the grid to form sparse data;
Step 1013, performing proportional contraction on the vertexes of the triangular face in the sparse data to obtain the coordinates of the new vertexes; here, the contraction ratio is preferably expressed in terms of the barycentric coordinates of the triangular face.
Specifically, let P be a point on the triangular face, and let (α, β, γ) denote the barycentric coordinates of P with respect to the triangular face; they can be computed by a formula (given as an image in the original publication);
wherein A, B and C are the three vertex coordinates of the triangular face, the barycentric coordinates satisfy α + β + γ = 1 with α, β and γ all greater than 0, P denotes the coordinates of the point, and n is the normal vector of the triangular face.
Step 1014, performing two-dimensional texture coordinate mapping on the coordinates of the new vertexes on the triangular face to obtain the data to be processed.
Here, the two-dimensional texture coordinates (UV) of each new vertex on the triangular face are mapped to locate the newly generated point in the texture map data and to retrieve the corresponding pixel data; sampling can be performed with bilinear interpolation, which ensures the reliability of the obtained data.
The texture map data defines the position information of each point on the triangular face; these points determine where the face lies in the texture map, and the two-dimensional texture coordinates (UV) give, for each point of the triangular face, its exactly corresponding position in the image.
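For illustration, the following is a minimal sketch of this densification-and-sampling idea: a new point is generated inside a triangular face from barycentric weights, its UV coordinate is interpolated with the same weights, and the texture is sampled with bilinear interpolation. The function names, the example triangle and the random texture are assumptions for the sketch, not taken from the patent.

```python
import numpy as np

def shrink_vertex(A, B, C, weights=(1/3, 1/3, 1/3)):
    """Return a new point on triangle ABC given barycentric weights (alpha, beta, gamma)."""
    a, b, g = weights
    assert abs(a + b + g - 1.0) < 1e-9 and min(a, b, g) > 0
    return a * A + b * B + g * C

def bilinear_sample(texture, uv):
    """Sample an H x W x C texture at fractional UV coordinates in [0, 1]."""
    h, w = texture.shape[:2]
    x, y = uv[0] * (w - 1), uv[1] * (h - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * texture[y0, x0] + fx * texture[y0, x1]
    bottom = (1 - fx) * texture[y1, x0] + fx * texture[y1, x1]
    return (1 - fy) * top + fy * bottom

# Usage: the new vertex's UV is interpolated with the same barycentric weights,
# then its pixel value is fetched from the texture image.
A, B, C = np.array([0., 0., 0.]), np.array([1., 0., 0.]), np.array([0., 1., 0.])
uv_A, uv_B, uv_C = np.array([0., 0.]), np.array([1., 0.]), np.array([0., 1.])
weights = (0.4, 0.3, 0.3)
new_vertex = shrink_vertex(A, B, C, weights)
new_uv = weights[0] * uv_A + weights[1] * uv_B + weights[2] * uv_C
texture = np.random.rand(64, 64, 3)          # stand-in for the oblique-photography texture
pixel = bilinear_sample(texture, new_uv)
```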
Step 102, dividing the data to be processed to obtain a plurality of data blocks to be semantically segmented;
when step 102 is specifically implemented, it may include:
step 1021, performing semantic annotation on the data to be processed to obtain annotated data to be processed;
Here, by performing semantic annotation on the data to be processed, high-precision semantic-level annotation is obtained. For example, in a building elevation measurement scene, part of the data to be processed can be annotated as wall data and another part can be annotated as window data; in this way the data to be processed can be clustered and refined.
The semantic annotation can be performed with technical means such as annotation software; when annotating, overlapping semantic labels on the same data to be processed should be avoided, so that the data to be processed is clearly partitioned.
Step 1022, dividing the marked data to be processed using square blocks of a preset standard area to obtain a plurality of data blocks to be semantically segmented.
Here, the marked data to be processed can be divided into a plurality of data blocks to be semantically segmented by using a preset standard area block;
Specifically, the data block segmentation is performed with each data block as the processing unit: a sampling anchor point and a sampling width W are set, so that the sampling region is a square of area W × W. Whether any point of the marked data to be processed falls inside the sampling region is determined by a selection formula (given as an image in the original publication); the data points to be processed that are in the selected state are marked as 1, and the remaining data points to be processed are marked as 0.
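The block selection described above can be pictured with the following sketch, which marks the points falling inside a square block of width W around a sampling anchor as 1 and the rest as 0; the planar (X, Y) selection and all names are assumptions for illustration.

```python
import numpy as np

def block_mask(points_xy, anchor_xy, width):
    """points_xy: (N, 2) planar coordinates; anchor_xy: (2,); width: block width W.
    Returns 1 for points inside the W x W square centred on the anchor, else 0."""
    half = width / 2.0
    inside = np.all(np.abs(points_xy - anchor_xy) <= half, axis=1)
    return inside.astype(np.uint8)

def split_into_blocks(points, anchors, width):
    """Return one data block (subset of the points) per sampling anchor."""
    blocks = []
    for anchor in anchors:
        mask = block_mask(points[:, :2], anchor, width)
        blocks.append(points[mask.astype(bool)])
    return blocks

# Usage with random stand-in data.
pts = np.random.rand(1000, 3) * 10.0
blocks = split_into_blocks(pts, anchors=[np.array([2.0, 2.0]), np.array([7.0, 7.0])], width=4.0)
```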
it should be noted that, each data block may be further subjected to random transformation, where the random transformation includes: equal proportion normalization, random scaling, random Z-axis rotation and random spatial translation;
Here, the equal-proportion normalization may be performed on each data block according to a normalization formula (given as an image in the original publication); wherein the normalization divisor is the maximum of X_length, Y_length and Z_length, X_length being the extent of the data to be processed in the data block along the X coordinate axis, Y_length the extent along the Y coordinate axis, and Z_length the extent along the Z coordinate axis; P is a point of the data to be processed, and the formula gives the point obtained after the transformation of P.
The random scaling may be performed on each data block according to a scaling formula (given as an image in the original publication); the formula gives the point obtained after scaling the point P.
The random Z-axis rotation may be performed on each data block according to a rotation formula (given as an image in the original publication); the formula gives the point obtained after the random Z-axis rotation of the point P.
The random spatial translation may be performed on each data block according to a translation formula (given as an image in the original publication); the formula gives the point obtained after the random spatial translation of the point P.
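A minimal sketch of the four random transformations listed above is given below; the exact formulas are shown only as images in the original publication, so the scaling and translation ranges here are assumptions.

```python
import numpy as np

def augment_block(points):
    """points: (N, 3) XYZ coordinates of one data block."""
    # Equal-proportion normalization: divide by the largest extent over X, Y and Z.
    extents = points.max(axis=0) - points.min(axis=0)
    points = points / extents.max()

    # Random scaling (range assumed).
    points = points * np.random.uniform(0.8, 1.2)

    # Random rotation about the Z axis.
    theta = np.random.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot_z = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    points = points @ rot_z.T

    # Random spatial translation (range assumed).
    return points + np.random.uniform(-0.1, 0.1, size=(1, 3))

augmented = augment_block(np.random.rand(4096, 3))
```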
Step 103, inputting a plurality of data blocks into a context coupling semantic segmentation model to carry out segmentation processing to obtain a plurality of segmentation results, wherein the context coupling semantic segmentation model comprises context sampling interpolation branches and direct sampling branches with different scale characteristics;
here, the point cloud segmentation processing is performed on the plurality of data blocks through the context sampling interpolation branch and the direct sampling branch of the context coupling semantic segmentation model to obtain a plurality of segmentation results, so that effective segmentation for a large-scale target scene can be realized.
It should be noted that the context sampling interpolation branch here can change the density of the data to be processed in the data block.
As shown in fig. 2, when step 103 is specifically implemented, it may include:
step 1031, reconstructing surface features of the plurality of data blocks to obtain local spatial features;
The local spatial features here include the normal direction determined by the neighborhood and the center of the neighborhood geometry;
The normal direction determined by the neighborhood is computed from the set of triangular faces formed by the clockwise-ordered edges in the neighborhood and from the number of triangular faces in the neighborhood, according to a formula (given as an image in the original publication); it should be noted that the description of the normal direction depends on the neighborhood structure for a given number of faces.
The center of the neighborhood geometry is computed from the points in the neighborhood and the number of points in the neighborhood, according to a formula (given as an image in the original publication).
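For illustration, the sketch below estimates comparable local spatial features (a per-point normal and the neighborhood geometric center) over k-nearest-neighbor neighborhoods. Note that the patent derives the normal from the neighborhood's triangular faces, whereas this sketch substitutes a covariance (PCA) based normal estimate; the value of k and all names are assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def local_spatial_features(points, k=16):
    """points: (N, 3). Returns per-point normals and neighborhood geometric centers."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)           # k nearest neighbors of every point
    normals = np.zeros_like(points)
    centers = np.zeros_like(points)
    for i, nbr in enumerate(idx):
        nbr_pts = points[nbr]
        center = nbr_pts.mean(axis=0)          # center of the neighborhood geometry
        cov = np.cov((nbr_pts - center).T)     # local covariance of the neighborhood
        _, eigvecs = np.linalg.eigh(cov)
        normals[i] = eigvecs[:, 0]             # smallest-variance direction as the normal
        centers[i] = center
    return normals, centers

normals, centers = local_spatial_features(np.random.rand(200, 3), k=16)
```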
step 1032, inputting the local spatial features into a plurality of first encoders in a context sampling interpolation branch of the context coupling semantic segmentation model for processing to obtain a plurality of first encoding processing results;
Here, the local spatial features are input into the plurality of first encoders in the context sampling interpolation branch of the context-coupled semantic segmentation model for processing, which changes the data density.
Step 1033, inputting the local spatial features into a plurality of second encoders in the direct sampling branch of the context-coupled semantic segmentation model for processing to obtain a plurality of second encoding processing results; the first encoder in the context sample interpolation branch is different from the second encoder in the direct sampling branch in coding scale characteristics;
It should be noted that the direct sampling branch of the context-coupled semantic segmentation model uses a pooling strategy different from that of the context sampling interpolation branch, so that the neighborhood feature context is expressed in different forms and the model's ability to express the local context is improved.
Step 1034, obtaining a plurality of segmentation results according to the plurality of first coding processing results and the plurality of second coding processing results.
In particular implementations, step 1034 may include:
Step 10341, performing multi-scale coupling on the n second encoding processing results respectively to obtain n coupling results; the coupling is performed according to a coupling formula (given as an image in the original publication);
wherein {F1', F2', …, Fn'} is the sequence formed by the n coupling results, a nonlinear transformation function is applied in the coupling, F1 is the first of the n second encoding processing results, F2 is the second of the n second encoding processing results, and Fn is the n-th second encoding processing result;
step 10342, respectively cascading the n coupling results with n first coding processing results to obtain n cascading results;
step 10343, inputting n cascade results and encoding processing results which are not subjected to multi-scale coupling into a decoder for decoding and recovering density respectively, and obtaining a plurality of segmentation results; n is a positive integer.
In one embodiment, the context sampling interpolation branch includes 4 first encoders and the direct sampling branch includes 4 second encoders; the numbers of first encoders and second encoders are not limited to 4, but in practical implementation they are equal. Likewise, the context-coupled semantic segmentation model includes 4 decoders. The first encoders may be strong encoders (p_encod) and the second encoders weak encoders (l_encod);
The encoding results of the first encoder 1 and the first encoder 2, together with the encoding results of the second encoder 1 and the second encoder 2, are directly input into the corresponding decoder 1 and decoder 2;
the encoding result of the second encoder 3 and the encoding result of the second encoder 4 are input into a multi-scale coupling module for multi-scale coupling processing to obtain a first coupling result F1'; the first coupling result F1' is cascaded with the encoding result of the first encoder 3 and then input into the decoder 3 for decoding;
the encoding result of the second encoder 4 and the encoding result of the second encoder 3 are input into the multi-scale coupling module for multi-scale coupling processing to obtain a second coupling result F2'; the second coupling result F2' is cascaded with the encoding result of the first encoder 4 and then input into the decoder 4 for decoding;
Fig. 3 is a schematic implementation diagram of a multi-scale coupling module provided by an embodiment of the disclosure, including:
processing the encoding result F1 of the second encoder 3 and the encoding result F2 of the second encoder 4 by a nonlinear transformation function Sigmoid respectively, and outputting a first coupling result F1' after information fusion of the obtained results; the information fusion can be performed by adopting a 1×1 convolution module, and is not limited to the 1×1 convolution module;
similarly, the encoding result F2 of the second encoder 4 and the encoding result F1 of the second encoder 3 are respectively processed by a nonlinear transformation function Sigmoid, and after the obtained results are subjected to information fusion, a second coupling result F2' is output; the information fusion can be performed by adopting a 1×1 convolution module, and is not limited to the 1×1 convolution module;
Fig. 3 shows the generation of the second coupling result F2'; the first coupling result F1' is not shown because it is obtained in the same way as F2';
it should be noted that, the input in the multi-scale coupling module is not limited to the encoding results of two encoders, but may be the encoding results of a plurality of encoders, and the processing manner is the same as the coupling processing process of the encoding results of two encoders, which is not described herein.
In this embodiment, the encoding results of the plurality of second encoders are respectively subjected to multi-scale coupling processing to obtain a plurality of coupling results, the plurality of coupling results are respectively subjected to cascade processing with the encoding results of the plurality of first encoders, and the plurality of cascade results and the encoding processing results which are not subjected to multi-scale coupling are respectively input to a decoder for decoding processing to obtain a plurality of segmentation results, so that the spatial edge information loss caused by inaccurate estimation information in the sampling process can be alleviated.
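To make the data flow concrete, the following is a minimal PyTorch-style sketch of one coupling step consistent with the description above: two weak-branch encoding results are each passed through a Sigmoid, fused with a 1×1 convolution, and the coupling result is cascaded (concatenated) with the corresponding strong-branch encoding result before decoding. The tensor shapes, channel counts and the use of simple 1-D convolutions on point features are assumptions, not details taken from the patent.

```python
import torch
import torch.nn as nn

class MultiScaleCoupling(nn.Module):
    """Sigmoid-gate two weak-branch encoding results and fuse them with a 1x1 convolution."""
    def __init__(self, channels):
        super().__init__()
        # The patent mentions a 1x1 convolution for information fusion, while noting
        # that the fusion is not limited to it.
        self.fuse = nn.Conv1d(2 * channels, channels, kernel_size=1)

    def forward(self, f_a, f_b):
        gated = torch.cat([torch.sigmoid(f_a), torch.sigmoid(f_b)], dim=1)
        return self.fuse(gated)                # coupling result, e.g. F1' or F2'

# Cascade the coupling result with the strong-branch encoding result before decoding.
B, C, N = 2, 64, 1024
f1, f2 = torch.randn(B, C, N), torch.randn(B, C, N)   # second (weak) encoders 3 and 4
p3 = torch.randn(B, C, N)                              # first (strong) encoder 3
f1_prime = MultiScaleCoupling(C)(f1, f2)
decoder3_input = torch.cat([f1_prime, p3], dim=1)      # features fed to decoder 3
```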
Step 104, performing splicing processing on the plurality of segmentation results to obtain a final semantic segmentation result of the target scene.
In particular, step 104 may include:
step 1041, performing an or operation on the overlapped portions of the multiple segmentation results to obtain a final semantic segmentation result of the target scene.
In this embodiment, the overlapping portions of the multiple segmentation results are subjected to or operation, so that the segmentation results can be spliced, a final semantic segmentation result of the target scene is obtained, and reliability of the splicing result is ensured.
Fig. 4 is a schematic diagram of a plurality of segmentation results according to an embodiment of the present disclosure, and an operation process of stitching the segmentation results is shown in fig. 5:
The first segmentation result and the second segmentation result are spliced; their overlap comprises a first group of overlapping data on the side of the first segmentation result and a second group of overlapping data on the side of the second segmentation result. During splicing, an OR operation can be performed on the two groups of overlapping data to obtain an operation result for the overlapping data; finally, the first segmentation result, the second segmentation result and the operation result of the overlapping data are spliced to obtain the final semantic segmentation result of the target scene;
specifically, as shown in Fig. 5, when the OR operation is performed on the two groups of overlapping data, the data that are lit up at corresponding positions are combined: for example, A3 is OR-ed with A3', A3' is lit, so A3' is kept; A4 is OR-ed with A4', A4 is lit, so A4 is kept; A5 is OR-ed with A5', A5 is lit, so A5 is kept; A10 and A10' are both lit, so one of them is kept. The remaining overlapping data are not lit but are normal overlapping data, and both copies are kept. In this way the operation result of the overlapping data is obtained.
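The OR-based merging of overlapping block results can be sketched as follows, treating each label as a "lit" flag per point; the data layout (per-block point indices and binary labels) is an assumption for illustration.

```python
import numpy as np

def or_merge(block_results):
    """block_results: list of (point_ids, labels) pairs, one per data block; a label of 0
    means the point is not 'lit'. Points shared by overlapping blocks keep any non-zero label."""
    merged = {}
    for point_ids, labels in block_results:
        for pid, lab in zip(point_ids, labels):
            merged[int(pid)] = merged.get(int(pid), 0) or int(lab)
    return merged

# Usage: two blocks overlapping on points 3, 4, 5 and 10.
block1 = (np.array([1, 2, 3, 4, 5, 10]), np.array([0, 1, 0, 1, 1, 1]))
block2 = (np.array([3, 4, 5, 10, 11]),   np.array([1, 0, 0, 1, 1]))
final_labels = or_merge([block1, block2])
```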
This embodiment achieves a good segmentation effect when used in building scenes and provides a reliable basis for the elevation measurement of buildings.
In this embodiment, processing the data with the context-coupled semantic segmentation model can alleviate the loss of neighborhood context information and the change of neighborhood semantics after pooled sampling.
According to the semantic segmentation processing method for data in the embodiment of the disclosure, different context representations are obtained through the context sampling interpolation branch and the direct sampling branch of the context-coupled semantic segmentation model, and data of different densities interact, so that a more accurate neighborhood description is obtained; at the same time, the method is suitable for data segmentation of large-scale natural scenes, adapts well to the complexity of targets in natural scenes, and perceives objects of different scales well.
All the above optional technical solutions may be combined arbitrarily to form an optional embodiment of the present disclosure, which is not described here in detail.
Fig. 6 is a schematic structural diagram of a data semantic segmentation processing apparatus according to an embodiment of the present disclosure, referring to fig. 6, the apparatus 600 includes:
the acquisition module 601 is configured to acquire to-be-processed data of a target scene, where the to-be-processed data is laser point information obtained by scanning a surface of a target object with a laser beam according to a preset track and reflecting the laser beam; the information carried by the laser point comprises: azimuth and distance;
The first processing module 602 is configured to perform segmentation processing on data to be processed to obtain a plurality of data blocks to be semantically segmented;
the second processing module 603 is configured to input a plurality of data blocks into a context-coupled semantic segmentation model to perform segmentation processing, so as to obtain a plurality of segmentation results, where the context-coupled semantic segmentation model includes context sample interpolation branches and direct sampling branches with different scale features;
and a third processing module 604, configured to perform a stitching process on the multiple segmentation results, so as to obtain a final semantic segmentation result of the target scene.
Optionally, the acquiring module 601 includes:
the first acquisition sub-module is used for reading vertexes of the OSGB (oblique scanning data) grid of the target scene;
the second acquisition submodule is used for extracting vertex coordinates of triangular faces in the grids to form sparse data;
the third acquisition submodule is used for carrying out proportional contraction on the vertexes of the triangular surface in the sparse data to obtain the coordinates of the new vertexes;
and the fourth acquisition sub-module is used for carrying out two-dimensional texture coordinate mapping on the coordinates of the new vertexes on the triangular surface to obtain data to be processed.
Optionally, the first processing module 602 includes:
the first processing sub-module is used for carrying out semantic annotation on the data to be processed to obtain the annotated data to be processed;
The second processing sub-module is used for dividing the marked data to be processed by using a preset standard area square block to obtain a plurality of data blocks to be subjected to semantic segmentation.
Optionally, the second processing module 603 includes:
the third processing sub-module is used for reconstructing the surface characteristics of the plurality of data blocks to obtain local space characteristics;
a fourth processing sub-module, configured to input local spatial features into a plurality of first encoders in a context sampling interpolation branch of the context coupling semantic segmentation model for processing, so as to obtain a plurality of first encoding processing results;
a fifth processing sub-module, configured to input the local spatial feature into a plurality of second encoders in a direct sampling branch of the context-coupled semantic segmentation model for processing, so as to obtain a plurality of second encoding processing results; the first encoder in the context sample interpolation branch is different from the second encoder in the direct sampling branch in coding scale characteristics;
and the sixth processing submodule is used for obtaining a plurality of segmentation results according to the plurality of first coding processing results and the plurality of second coding processing results.
Optionally, the sixth processing submodule includes:
the first processing subunit is used for respectively carrying out multi-scale coupling on n second coding processing results in the plurality of second coding processing results to obtain n coupling results;
The second processing subunit is used for respectively cascading the n coupling results with the n first coding processing results to obtain n cascading results;
the third processing subunit is used for inputting the n cascade results and the encoding processing result which is not subjected to multi-scale coupling into a plurality of decoders respectively to decode and recover the density, so as to obtain a plurality of segmentation results; n is a positive integer.
Optionally, the first processing subunit performs multi-scale coupling on the n second encoding processing results respectively, according to a coupling formula (given as an image in the original publication), to obtain n coupling results;
wherein {F1', F2', …, Fn'} is the sequence formed by the n coupling results, a nonlinear transformation function is applied in the coupling, F1 is the first of the n second encoding processing results, F2 is the second of the n second encoding processing results, and Fn is the n-th second encoding processing result.
Optionally, the third processing module 604 includes:
and the seventh processing sub-module is used for performing OR operation on the overlapped parts of the plurality of segmentation results to obtain a final semantic segmentation result of the target scene.
According to the device provided by the embodiment of the disclosure, different context representations are obtained through the context sampling interpolation branch and the direct sampling branch of the context coupling semantic segmentation model, and the data with different densities are interacted, so that more accurate neighborhood description is obtained, meanwhile, the device is suitable for data segmentation of large-scale natural scenes, can be better suitable for target complexity in the natural scenes, and has better perceptibility on objects with different scales.
It should be noted that: in the semantic segmentation processing device for data provided in the above embodiment, the division into the above functional modules is used only for illustration; in practical application, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the semantic segmentation processing device for data provided in the above embodiment and the semantic segmentation processing method for data belong to the same conception; for the detailed implementation process of the device, refer to the method embodiment, which is not repeated herein.
The semantic segmentation processing device of data in the embodiment of the disclosure may be a virtual device, or may be a component, an integrated circuit or a chip in a server or a terminal. The device may be a mobile electronic device or a non-mobile electronic device. By way of example, the mobile electronic device may be a cell phone, tablet computer, notebook computer, palm-top computer, vehicle-mounted electronic device, wearable device, ultra-mobile personal computer (UMPC), netbook or personal digital assistant (PDA), etc., and the non-mobile electronic device may be a server, network attached storage (NAS), personal computer (PC), television (TV), teller machine or self-service machine, etc.; the embodiments of the disclosure are not specifically limited in this respect.
The semantic segmentation processing device of data in the embodiments of the present disclosure may be a device having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, and the embodiments of the present disclosure are not specifically limited in this respect.
The semantic segmentation processing device for data provided in the embodiments of the present disclosure can implement each process implemented by the method embodiments of Figs. 1 to 5; to avoid repetition, details are not repeated here.
Optionally, as shown in fig. 7, the embodiment of the present disclosure further provides a computing device 700, including a processor 701, a memory 702, and a program or an instruction stored in the memory 702 and capable of being executed on the processor 701, where the program or the instruction implements each process of the embodiment of the semantic segmentation processing method of data described above when executed by the processor 701, and the same technical effects are achieved, and for avoiding repetition, a description is omitted herein. It should be noted that, the computing device in the embodiments of the present disclosure includes the mobile electronic device and the non-mobile electronic device described above.
Fig. 8 is a schematic diagram of a hardware architecture of a computing device implementing an embodiment of the present disclosure.
The computing device 800 includes, but is not limited to: radio frequency unit 801, network module 802, audio output unit 803, input unit 804, sensor 805, display unit 806, user input unit 807, interface unit 808, memory 809, and processor 810.
Those skilled in the art will appreciate that the computing device 800 may also include a power source (e.g., a battery) for powering the various components, which may be logically connected to the processor 810 by a power management system to perform functions such as managing charge, discharge, and power consumption by the power management system. The computing device structure shown in fig. 8 is not limiting of the computing device, and the computing device may include more or less components than illustrated, or may combine certain components, or a different arrangement of components, which are not described in detail herein.
It should be appreciated that in embodiments of the present disclosure, the input unit 804 may include a graphics processor (Graphics Processing Unit, GPU) 8041 and a microphone 8042, the graphics processor 8041 processing image data of still pictures or video obtained by an image capturing apparatus (such as an image capturing device) in a video capturing mode or an image capturing mode. The display unit 806 may include a display panel 8061, and the display panel 8061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 807 includes a touch panel 8071 and other input devices 8072. Touch panel 8071, also referred to as a touch screen. The touch panel 8071 may include two parts, a touch detection device and a touch controller. Other input devices 8072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and so forth, which are not described in detail herein. The memory 809 may be used to store software programs as well as various data including, but not limited to, application programs and an operating system. The processor 810 may integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a modem processor that primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 810.
The embodiment of the present disclosure further provides a readable storage medium, where a program or an instruction is stored, where the program or the instruction implements each process of the semantic segmentation processing method embodiment of data when executed by a processor, and the process can achieve the same technical effect, so that repetition is avoided, and no further description is given here.
Wherein the processor is a processor in the computing device in the above embodiments. Readable storage media include computer readable storage media such as Read-Only Memory (ROM), random access Memory (Random Access Memory, RAM), magnetic or optical disks, and the like.
The embodiment of the disclosure further provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction, implement each process of the foregoing data semantic segmentation processing method embodiment, and achieve the same technical effect, so that repetition is avoided, and no further description is given here.
It should be understood that the chips referred to in the embodiments of the present disclosure may also be referred to as system-on-chip chips, chip systems, or system-on-chip chips, etc.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present disclosure is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in an opposite order depending on the functions involved, e.g., the described methods may be performed in an order different from that described, and various steps may also be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solutions of the present disclosure may be embodied essentially or in part in the form of a computer software product stored on a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) including instructions for causing a terminal (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the methods of the various embodiments of the present disclosure.
The embodiments of the present disclosure have been described above with reference to the accompanying drawings, but the present disclosure is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those of ordinary skill in the art without departing from the spirit of the disclosure and the scope of the claims, which are all within the protection of the present disclosure.

Claims (9)

1. A semantic segmentation processing method of data, comprising:
obtaining to-be-processed data of a target scene, wherein the to-be-processed data is laser point information obtained by scanning a surface of a target object through a laser beam according to a preset track and reflecting the laser beam; the information carried by the laser point comprises: azimuth and distance;
dividing the data to be processed to obtain a plurality of data blocks to be semantically divided;
inputting a plurality of data blocks into a context coupling semantic segmentation model to carry out segmentation processing to obtain a plurality of segmentation results, wherein the context coupling semantic segmentation model comprises context sampling interpolation branches and direct sampling branches with different scale characteristics;
performing splicing treatment on the multiple segmentation results to obtain a final semantic segmentation result of the target scene;
inputting the data blocks into a context coupling semantic segmentation model for segmentation processing to obtain a plurality of segmentation results, wherein the method comprises the following steps:
reconstructing the surface features of the data blocks to obtain local spatial features;
inputting the local spatial features into a plurality of first encoders in a context sampling interpolation branch of the context coupling semantic segmentation model for processing to obtain a plurality of first encoding processing results;
Inputting the local spatial features into a plurality of second encoders in a direct sampling branch of the context coupling semantic segmentation model for processing to obtain a plurality of second encoding processing results; the first encoder in the context sampling interpolation branch is different from the second encoder in the direct sampling branch in coding scale characteristics;
and obtaining a plurality of segmentation results according to a plurality of the first coding processing results and a plurality of the second coding processing results.
2. The method for semantic segmentation processing of data according to claim 1, wherein acquiring data to be processed of a target scene comprises:
reading vertexes of the OSGB (object scene oriented oblique scanning data) grid;
extracting vertex coordinates of triangular faces in the grid to form sparse data;
proportional contraction is carried out on the vertexes of the triangular surface in the sparse data, and coordinates of new vertexes are obtained;
and carrying out two-dimensional texture coordinate mapping on the coordinates of the new vertexes on the triangular surface to obtain data to be processed.
3. The method for semantic segmentation processing of data according to claim 1, wherein the segmentation processing is performed on the data to be processed to obtain a plurality of data blocks to be semantically segmented, comprising:
Semantic annotation is carried out on the data to be processed to obtain annotated data to be processed;
and dividing the marked data to be processed by using a preset standard area square block to obtain a plurality of data blocks to be semantically segmented.
4. The method according to claim 1, wherein obtaining a plurality of division results based on the plurality of first encoding processing results and the plurality of second encoding processing results, comprises:
respectively carrying out multi-scale coupling on n second coding processing results in the plurality of second coding processing results to obtain n coupling results;
respectively cascading the n coupling results with n first coding processing results to obtain n cascading results;
inputting the n cascade results and the encoding processing result which is not subjected to multi-scale coupling into a plurality of decoders respectively to decode and recover the density, so as to obtain a plurality of segmentation results; n is a positive integer.
5. The method for semantic segmentation processing of data according to claim 1, wherein performing multi-scale coupling on n second encoding processing results among the plurality of second encoding processing results, respectively, to obtain n coupling results, includes:
according to
Y = σ(x_1, x_2, …, x_n)
respectively performing multi-scale coupling on the n second encoding processing results to obtain the n coupling results;
wherein Y is the sequence formed by the n coupling results, σ(·) is the nonlinear transformation function, x_1 is the first of the n second encoding processing results, x_2 is the second of the n second encoding processing results, and x_n is the nth second encoding processing result.
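The formula in claim 5 is published only as images, so the exact form of the nonlinear transformation σ is not recoverable here; the sketch below is one hedged reading in which σ is a learnable 1x1 convolution followed by a nonlinearity, applied to the channel-wise combination of the n second encoding processing results to yield n coupling results.

    import torch
    import torch.nn as nn

    class MultiScaleCoupling(nn.Module):
        # Y = sigma(x_1, ..., x_n): maps n second encoding results to n coupling results.
        # The concrete sigma below (1x1 conv + tanh per output) is an assumption.
        def __init__(self, channels, n):
            super().__init__()
            self.transforms = nn.ModuleList(
                nn.Conv1d(channels * n, channels, 1) for _ in range(n))

        def forward(self, second_results):
            # second_results: list of n tensors of shape (B, channels, N)
            stacked = torch.cat(second_results, dim=1)   # combine x_1 .. x_n along channels
            return [torch.tanh(t(stacked)) for t in self.transforms]  # sequence Y

    # usage: ys = MultiScaleCoupling(channels=64, n=3)([torch.randn(2, 64, 1024)] * 3)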
6. The method for semantic segmentation processing of data according to claim 1, wherein performing splicing processing on the plurality of segmentation results to obtain a final semantic segmentation result of the target scene comprises:
and performing OR operation on the overlapped parts of the plurality of segmentation results to obtain a final semantic segmentation result of the target scene.
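A small NumPy sketch of the splicing step in claim 6, assuming each block's segmentation result has already been rasterized into a boolean per-class mask in the coordinate frame of the whole scene; the logical OR merges the overlapping parts.

    import numpy as np

    def stitch_results(block_masks):
        # block_masks: list of boolean arrays of identical shape (num_classes, H, W)
        final = np.zeros_like(block_masks[0], dtype=bool)
        for mask in block_masks:
            final |= mask             # OR operation on the overlapping parts
        return final                  # final semantic segmentation result of the target scene

    # usage: result = stitch_results([np.random.rand(5, 64, 64) > 0.5 for _ in range(4)])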
7. A semantic segmentation processing apparatus for data, comprising:
the acquisition module is used for acquiring the data to be processed of the target scene;
the first processing module is used for carrying out segmentation processing on the data to be processed to obtain a plurality of data blocks to be subjected to semantic segmentation;
the second processing module is used for inputting the plurality of data blocks into a context coupling semantic segmentation model to be subjected to segmentation processing to obtain a plurality of segmentation results, wherein the context coupling semantic segmentation model comprises context sampling interpolation branches and direct sampling branches with different scale characteristics;
the third processing module is used for performing splicing processing on the plurality of segmentation results to obtain a final semantic segmentation result of the target scene;
wherein inputting the plurality of data blocks into the context coupling semantic segmentation model for segmentation processing to obtain the plurality of segmentation results comprises:
reconstructing the surface features of the data blocks to obtain local spatial features;
inputting the local spatial features into a plurality of first encoders in a context sampling interpolation branch of the context coupling semantic segmentation model for processing to obtain a plurality of first encoding processing results;
inputting the local spatial features into a plurality of second encoders in a direct sampling branch of the context coupling semantic segmentation model for processing to obtain a plurality of second encoding processing results, wherein the first encoders in the context sampling interpolation branch and the second encoders in the direct sampling branch have different coding scale characteristics;
and obtaining a plurality of segmentation results according to a plurality of the first coding processing results and a plurality of the second coding processing results.
8. A computing device, comprising: a processor and a memory storing a computer program which, when executed by the processor, causes the processor to perform the method of any one of claims 1 to 6.
9. A computer readable storage medium storing instructions which, when run on a computer, cause the computer to perform the method of any one of claims 1 to 6.
CN202310308128.3A 2023-03-28 2023-03-28 Semantic segmentation processing method, device and equipment for data Active CN116091778B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310308128.3A CN116091778B (en) 2023-03-28 2023-03-28 Semantic segmentation processing method, device and equipment for data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310308128.3A CN116091778B (en) 2023-03-28 2023-03-28 Semantic segmentation processing method, device and equipment for data

Publications (2)

Publication Number Publication Date
CN116091778A CN116091778A (en) 2023-05-09
CN116091778B true CN116091778B (en) 2023-06-20

Family

ID=86187141

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310308128.3A Active CN116091778B (en) 2023-03-28 2023-03-28 Semantic segmentation processing method, device and equipment for data

Country Status (1)

Country Link
CN (1) CN116091778B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9953236B1 (en) * 2017-03-10 2018-04-24 TuSimple System and method for semantic segmentation using dense upsampling convolution (DUC)
CN114926636A (en) * 2022-05-12 2022-08-19 合众新能源汽车有限公司 Point cloud semantic segmentation method, device, equipment and storage medium
CN114972763B (en) * 2022-07-28 2022-11-04 香港中文大学(深圳)未来智联网络研究院 Laser radar point cloud segmentation method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN116091778A (en) 2023-05-09

Similar Documents

Publication Publication Date Title
CN109300190B (en) Three-dimensional data processing method, device, equipment and storage medium
CN109977192B (en) Unmanned aerial vehicle tile map rapid loading method, system, equipment and storage medium
CN106709871B (en) Method and system for image composition using active masks
JP7273129B2 (en) Lane detection method, device, electronic device, storage medium and vehicle
WO2021027692A1 (en) Visual feature library construction method and apparatus, visual positioning method and apparatus, and storage medium
KR20210036319A (en) Method, apparatus and electronic device for identifying text content
CN113869138A (en) Multi-scale target detection method and device and computer readable storage medium
KR20220153667A (en) Feature extraction methods, devices, electronic devices, storage media and computer programs
CN114792355B (en) Virtual image generation method and device, electronic equipment and storage medium
CN116091778B (en) Semantic segmentation processing method, device and equipment for data
CN112085842B (en) Depth value determining method and device, electronic equipment and storage medium
US20190087988A1 (en) Table cell validation
CN116503347A (en) Method and device for detecting leakage oil of power system and training model and computer equipment
CN113610856B (en) Method and device for training image segmentation model and image segmentation
Yang et al. Substation meter detection and recognition method based on lightweight deep learning model
CN113361519B (en) Target processing method, training method of target processing model and device thereof
WO2022174517A1 (en) Crowd counting method and apparatus, computer device and storage medium
CN111078812B (en) Fence generation method and device and electronic equipment
Wang et al. Salient object detection with high‐level prior based on Bayesian fusion
Wu et al. Industrial equipment detection algorithm under complex working conditions based on ROMS R-CNN
CN112101252A (en) Image processing method, system, device and medium based on deep learning
CN111462121A (en) Image cropping method, system, device and medium based on image semantic understanding
CN112465692A (en) Image processing method, device, equipment and storage medium
CN114187408B (en) Three-dimensional face model reconstruction method and device, electronic equipment and storage medium
KR102504007B1 (en) Context vector extracting module generating context vector from partitioned image and operating method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant