CN115861610A - Improved CondInst-based sandstone aggregate image segmentation processing method - Google Patents
- Publication number: CN115861610A (application CN202211461364.0A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Abstract
The invention relates to the technical field of image processing, and in particular to a sandstone aggregate image segmentation processing method based on an improved CondInst. The method comprises: collecting and labeling sandstone aggregate images and establishing a sandstone aggregate image data set; preprocessing the data set and dividing it into a training set and a test set; building an improved CondInst instance segmentation model; setting network training parameters, then training and testing the improved CondInst algorithm model; and inputting sandstone aggregate images newly captured by the camera into the trained model to obtain segmentation results. Compared with the original model, the method segments sandstone aggregate images faster and more accurately through the fusion of length-width features and the improvement of the bounding box regression loss function.
Description
Technical Field
The invention relates to the technical field of image processing, and in particular to a sandstone aggregate image segmentation processing method based on an improved CondInst.
Background
With the continuous development of infrastructure industries such as modern buildings, roads, and bridges, sandstone aggregate has become an indispensable material in infrastructure construction. Technical indexes such as gradation, oversize particle content, and needle-flake particle content greatly affect the performance of the resulting concrete, and thus the safety and reliability of buildings.
Detection of parameters such as sandstone aggregate gradation, oversize particle content, and needle-flake particle content mostly relies on manual measurement with instruments such as stone sieves, oversize sieves, needle-flake gauges, and vernier calipers, which is time-consuming and labor-intensive. With the rise of the machine vision industry, sandstone aggregate images have gradually been applied to the detection of these technical indexes: the image is segmented by an image segmentation technique and the segmentation result is analyzed to obtain the parameters. The segmentation algorithm therefore has a key influence on the accuracy of the parameter analysis, and its precision and inference speed determine whether it can be put into practical use. Due to the complexity of sandstone aggregate images, traditional image segmentation methods are not suitable; neural network models such as Mask R-CNN can solve the segmentation problem to a certain extent, but their segmentation speed and precision still need improvement for practical application.
Disclosure of Invention
Because aggregates in a sandstone aggregate image are similar to one another in color, the boundaries between individual aggregates are unclear, and the color distribution of a single aggregate is not uniform, segmentation of sandstone aggregate images is very difficult. Traditional image segmentation methods are therefore not applicable, and although neural network models such as Mask R-CNN and UNet solve the segmentation problem to a certain extent, their segmentation speed and precision cannot meet practical requirements under certain conditions. In view of this, the present invention provides a sandstone aggregate image segmentation processing method based on an improved CondInst.
To achieve this purpose, the invention provides the following technical scheme:
In a first aspect, the present invention provides a sandstone aggregate image segmentation processing method based on an improved CondInst, comprising the following steps:
1) Collecting and labeling sandstone aggregate images, and establishing a sandstone aggregate image data set;
2) Preprocessing the sandstone aggregate image data set, and dividing it into a training set and a test set;
3) Building an improved CondInst instance segmentation model;
4) Setting network training parameters, and training and testing the improved CondInst algorithm model;
5) Inputting sandstone aggregate images newly captured by the camera into the trained improved CondInst algorithm model to obtain the segmentation results.
As a further scheme of the present invention, in step 1), collecting the sandstone aggregate images comprises: spreading sandstone aggregate randomly on a horizontal platform under different illumination and humidity conditions, and shooting with an industrial camera fixed above the platform; after shooting, labeling the aggregate edges with Labelme software, marking smooth transitions at occluded parts, and establishing the sandstone aggregate image data set.
As a further scheme of the invention, in step 2), the sandstone aggregate image data set is preprocessed using histogram equalization and Z-score standardization, and randomly divided into a training set and a test set in a 9:1 ratio.
As a further scheme of the present invention, histogram equalization improves image contrast by changing the gray-level distribution of the image, specifically: first calculate the occurrence probability of each gray value in the image and arrange the gray values in ascending order; then add each gray value's occurrence probability to those of all smaller gray values to obtain its cumulative probability; finally, modify the gray value of each pixel in the original image to the product of its cumulative probability and the maximum gray value 255.
As a further scheme of the invention, the Z-score standardization subtracts the mean of all pixel values of the histogram-equalized sandstone aggregate image from each pixel value and divides by their standard deviation, transforming the image into a distribution with mean 0 and variance 1 and yielding the preprocessed sandstone aggregate image data set.
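As a minimal illustration, the two preprocessing steps above can be sketched in Python with NumPy (function names are illustrative, not part of the invention):

```python
import numpy as np

def histogram_equalize(gray):
    """Map each gray value to round(cumulative probability * 255), as described."""
    hist = np.bincount(gray.ravel(), minlength=256)  # count of each gray value
    prob = hist / gray.size                          # occurrence probability
    cdf = np.cumsum(prob)                            # cumulative probability (values ascending)
    lut = np.round(cdf * 255).astype(np.uint8)       # new gray value for each old gray value
    return lut[gray]

def z_score_normalize(img):
    """Z-score standardization: subtract the mean, divide by the standard deviation."""
    img = img.astype(np.float64)
    return (img - img.mean()) / img.std()
```

After `z_score_normalize`, the resulting array has mean 0 and variance 1, matching the distribution described above.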
As a further aspect of the present invention, in step 3), the improved CondInst algorithm model comprises a backbone network, a segmentation mask generation branch, and a loss function, wherein the feature extraction part of the backbone network is based on that of the original CondInst algorithm model and comprises:
performing feature extraction with a ResNet network comprising five stages, the output feature map of each stage being half the size of that of the previous stage, so that the ResNet yields a feature pyramid of successively smaller feature maps, the output feature maps of the last three stages then being further fused to form a new feature pyramid;
inputting each layer of the new feature pyramid into a shared head network: after a shared convolution layer, each layer feeds different output layers, comprising a classification prediction layer, a bounding box prediction layer, a centerness prediction layer, and a dynamic convolution parameter generation layer; the classification, bounding box, and centerness prediction layers are unchanged, while the dynamic convolution parameter generation layer is modified.
According to a further scheme of the invention, in the segmentation mask generation branch, the number of channels of the parameter matrix generated by the dynamic convolution parameter generation layer is changed from 169 to 185, and the improved CondInst algorithm model not only generates segmentation masks from the P3 layer of the feature pyramid but also predicts them from the P4 and P5 layers.
As a further scheme of the invention, in the segmentation mask generation branch, the improved CondInst algorithm model operates on the three fused feature maps P3, P4, and P5 in the following steps:
using convolution layers to reduce the number of channels of the feature maps, extracting features while reducing computation;
stacking the resulting output in the channel direction with the length-width features and relative position features of the positive sample obtained from the backbone network, forming the fused feature map of the mask branch;
inputting the fused feature map into a mask head formed from the dynamic convolution parameters generated by the backbone network to produce the final segmentation mask, with different instances using their corresponding dynamic convolution parameters.
As a further aspect of the present invention, the loss function of the CondInst algorithm model comprises a classification loss, a bounding box regression loss, a centerness loss, and a segmentation mask loss, that is:
L_overall = L_cls + L_reg + L_ctrness + L_mask
wherein L_cls denotes the classification loss, L_reg the bounding box regression loss, L_ctrness the centerness loss, and L_mask the segmentation mask loss.
As a further scheme of the invention, in the CondInst algorithm model, the bounding box regression loss in the original loss function is changed from the IOU loss to the Focal-EIOU loss:

L_Focal-EIOU = IOU^γ · L_EIOU
L_EIOU = 1 − IOU + ρ²(b, b_gt)/c² + ρ²(w, w_gt)/C_w² + ρ²(h, h_gt)/C_h²

wherein IOU denotes the intersection-over-union, i.e. the ratio of the intersection to the union of the predicted and target bounding boxes; γ is a parameter controlling the degree of outlier suppression; b and b_gt denote the center points of the predicted and target bounding boxes; c denotes the diagonal length of the smallest rectangle enclosing both boxes; w and w_gt denote the widths of the predicted and target bounding boxes, and C_w the width of the enclosing rectangle; h and h_gt denote the heights of the predicted and target bounding boxes, and C_h the height of the enclosing rectangle; ρ(·, ·) denotes the Euclidean distance.
As a further scheme of the present invention, the training and testing in step 4), performed in the Python language under the PyTorch deep learning framework, comprise:
before training, selecting and setting the training parameters, including the batch size, learning rate, number of epochs, and optimizer;
after the training parameters are set, performing training to obtain the sandstone aggregate image segmentation model, and then inputting the test set into the model for testing.
As a further scheme of the invention, in step 5), the trained and tested sandstone aggregate image segmentation model is deployed in the backend of a coarse aggregate parameter analysis platform; sandstone aggregate images to be processed, collected by the camera, are input into the model, and the segmentation results are displayed on the platform interface.
In a second aspect, the present invention provides a computer apparatus comprising a memory and a processor, the memory storing a computer program which, when loaded and executed by the processor, implements the steps of the sandstone aggregate image segmentation processing method based on an improved CondInst.
In a third aspect, the present invention provides a storage medium storing a computer program which, when loaded and executed by a processor, implements the steps of the sandstone aggregate image segmentation processing method based on an improved CondInst.
The technical scheme provided by the invention has the following beneficial effects:
according to the sand-aggregate image segmentation processing method based on the improved CondInst, provided by the invention, the masks are predicted by utilizing the multi-layer fusion characteristics of the characteristic pyramid in the CondInst model segmentation mask generation branch, and the position characteristics and the length and width characteristics are not only added into each layer of fusion characteristics, so that the model can more quickly judge whether each pixel point around the center of the model instance belongs to the current instance, and the mask prediction process is accelerated.
The bounding box regression loss in the original model's loss function is changed from the IOU loss to the Focal-EIOU loss, which considers not only the overlapping area of the predicted and target bounding boxes but also the center-point distance loss and width-height loss between them. This further alleviates the vanishing-gradient problem that arises when the boxes do not overlap and accelerates model convergence; meanwhile, the parameter controlling the degree of outlier suppression lets well-predicted bounding boxes contribute larger losses, improving bounding box regression precision.
In conclusion, compared with the original model, the method provided by the invention segments sandstone aggregate images faster and more accurately through the fusion of length-width features and the improvement of the bounding box regression loss function.
These and other aspects of the invention are apparent from and will be elucidated with reference to the embodiments described hereinafter. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings required in the description of the embodiments are briefly introduced below; obviously, the drawings described below represent only some embodiments of the present invention.
In the drawings:
fig. 1 is a flowchart of a sand aggregate image segmentation processing method based on improved CondInst according to an embodiment of the present invention.
Fig. 2 is an overall frame diagram of a modified CondInst algorithm model in the sand-aggregate image segmentation processing method based on modified CondInst according to an embodiment of the present invention.
Fig. 3 is an overall framework diagram of the original CondInst algorithm model.
Fig. 4 is a block diagram of a computer device corresponding to the sand-aggregate image segmentation processing method based on the improved CondInst according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Some of the flows described in the specification, claims, and the above drawings include a number of operations occurring in a particular order, but it should be clearly understood that these operations may be performed out of that order or in parallel. Labels such as 101 and 102 merely distinguish different operations and do not by themselves imply any order of execution. In addition, the flows may include more or fewer operations, which may be performed sequentially or in parallel. The designations "first", "second", etc. herein distinguish different messages, devices, modules, and the like; they represent neither a sequence nor a restriction that "first" and "second" be of different types.
The technical solutions in the exemplary embodiments of the present invention will be clearly and completely described below with reference to the drawings in the exemplary embodiments of the present invention, and it is obvious that the described exemplary embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Because aggregates in a sandstone aggregate image are similar to one another in color, the boundaries between individual aggregates are unclear, and the color distribution of a single aggregate is not uniform, segmentation of sandstone aggregate images is very difficult. Traditional image segmentation methods are therefore not applicable, and although neural network models such as Mask R-CNN and UNet solve the segmentation problem to a certain extent, their segmentation speed and accuracy cannot meet practical requirements under certain conditions.
To solve these problems, the invention provides a sandstone aggregate image segmentation processing method based on an improved CondInst.
Specifically, the embodiments of the present application are further described below with reference to the accompanying drawings.
Referring to fig. 1, an embodiment of the present invention provides a sandstone aggregate image segmentation processing method based on an improved CondInst, comprising steps 1) to 5):
1) Collecting and labeling sandstone aggregate images, and establishing a sandstone aggregate image data set;
2) Preprocessing the sandstone aggregate image data set, and dividing it into a training set and a test set;
3) Constructing an improved CondInst instance segmentation model;
4) Setting network training parameters, and training and testing the improved CondInst algorithm model;
5) Inputting sandstone aggregate images newly captured by the camera into the trained improved CondInst algorithm model to obtain the segmentation results.
In step 1) of this embodiment, collecting the sandstone aggregate images comprises: spreading sandstone aggregate randomly on a horizontal platform under different illumination and humidity conditions, and shooting with an industrial camera fixed above the platform; after shooting, labeling the aggregate edges with Labelme software, marking smooth transitions at occluded parts, and establishing the sandstone aggregate image data set.
In step 2) of this embodiment, the sandstone aggregate image data set is preprocessed using histogram equalization and Z-score standardization, and randomly divided into a training set and a test set in a 9:1 ratio.
The histogram equalization improves image contrast by changing the gray-level distribution of the image, specifically: first calculate the occurrence probability of each gray value in the image and arrange the gray values in ascending order; then add each gray value's occurrence probability to those of all smaller gray values to obtain its cumulative probability; finally, modify the gray value of each pixel in the original image to the product of its cumulative probability and the maximum gray value 255.
The Z-score standardization subtracts the mean of all pixel values of the histogram-equalized sandstone aggregate image from each pixel value and divides by their standard deviation, transforming the image into a distribution with mean 0 and variance 1 and yielding the preprocessed sandstone aggregate image data set.
In step 3) of this embodiment, the improved CondInst algorithm model comprises a backbone network, a segmentation mask generation branch, and a loss function; the overall framework of the improved model is shown in fig. 2. The feature extraction part of the backbone network is based on that of the original CondInst algorithm model, shown in fig. 3, and proceeds as follows. First, features are extracted with a ResNet network comprising five stages, the output feature map of each stage being half the size of that of the previous stage, so that the ResNet yields a feature pyramid of successively smaller feature maps. The output feature maps of the last three stages are then further fused: the output of the fifth stage is convolved to obtain the P5 layer of a new feature pyramid; P5 is upsampled and added to the convolved output of the fourth stage to form the P4 layer; the P3 layer is obtained by a similar operation; meanwhile, P5 undergoes two successive downsampling operations to obtain the P6 and P7 layers, completing the new feature pyramid.
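The top-down pyramid construction described above can be sketched at the level of feature map shapes in Python/NumPy. The random 1 × 1 projections, nearest-neighbor upsampling, and stride-2 slicing below are stand-ins for the learned lateral convolutions and downsampling layers, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, w):
    # x: (C_in, H, W); w: (C_out, C_in). A 1x1 convolution is a per-pixel matmul.
    return np.tensordot(w, x, axes=([1], [0]))

def upsample2x(x):
    # Nearest-neighbor upsampling, doubling H and W
    return x.repeat(2, axis=1).repeat(2, axis=2)

def build_pyramid(c3, c4, c5, out_ch=256):
    """Top-down fusion as described: P5 from C5, P4 = up(P5) + lateral(C4),
    P3 = up(P4) + lateral(C3); P6 and P7 by downsampling P5 twice."""
    w5 = rng.standard_normal((out_ch, c5.shape[0]))
    w4 = rng.standard_normal((out_ch, c4.shape[0]))
    w3 = rng.standard_normal((out_ch, c3.shape[0]))
    p5 = conv1x1(c5, w5)
    p4 = upsample2x(p5) + conv1x1(c4, w4)
    p3 = upsample2x(p4) + conv1x1(c3, w3)
    p6 = p5[:, ::2, ::2]  # stand-in for a stride-2 convolution (halves H and W)
    p7 = p6[:, ::2, ::2]
    return p3, p4, p5, p6, p7
```

For a 512 × 512 input to a ResNet-50-style backbone, the last three stage outputs would be 64 × 64, 32 × 32, and 16 × 16, so P3 through P7 span 64 × 64 down to 4 × 4.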
Each layer of the new feature pyramid is then input into a shared head network: after a shared convolution layer, each layer feeds different output layers, comprising a classification prediction layer, a bounding box prediction layer, a centerness prediction layer, and a dynamic convolution parameter generation layer; the classification, bounding box, and centerness prediction layers are unchanged, while the dynamic convolution parameter generation layer is modified.
Owing to the subsequent improvement of the segmentation mask generation branch, the number of channels of the parameter matrix generated by the dynamic convolution parameter generation layer is modified from 169 to 185.
Referring to FIG. 2, the parameter matrix generated by the dynamic convolution parameter generation layer serves as the parameters of the mask full convolution head network (mask FCN head). In the original model, the feature map input to the mask FCN head has 10 channels; in the improved model, the eight-channel feature map F_mask generated after convolution is stacked in the channel direction with the length-width features of the positive sample (the w and h channels) and its relative position features (the x and y channels), giving a fused feature map with 12 channels. A column of parameters along the channel direction of the parameter matrix is applied in the mask FCN head to generate a mask, so the convolution kernel size of the first convolution layer of the mask FCN head changes from 1 × 1 × 10 to 1 × 1 × 12; since there are 8 such kernels, the column of parameters must grow by 16, i.e. from the original 169 channels to 185 channels.
In the modification, the feature map produced by the preceding convolution is convolved with 16 additional convolution kernels to obtain the dynamic convolution parameters of the 16 additional channels.
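The channel counts above can be checked arithmetically. Assuming the mask FCN head consists of three 1 × 1 convolutions (input → 8 → 8 → 1, as in the original CondInst), the dynamic parameter total is:

```python
def mask_head_param_count(in_ch, mid_ch=8, n_layers=3):
    """Weights plus biases of a 1x1-convolution mask head: in_ch -> mid_ch -> mid_ch -> 1."""
    total, c = 0, in_ch
    for i in range(n_layers):
        out = 1 if i == n_layers - 1 else mid_ch
        total += c * out + out  # c*out weights and out biases per 1x1 layer
        c = out
    return total
```

With 10 input channels this gives 169, and with the two extra channels (12 inputs) it gives 185; the increase of 16 equals the 2 extra channels times the 8 first-layer kernels, matching the description above.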
In this embodiment, in the segmentation mask generation branch, the improved CondInst algorithm model not only generates segmentation masks from the P3 layer of the feature pyramid but also predicts them from the P4 and P5 layers, and performs the following operations on the three fused feature maps P3, P4, and P5:
first, convolution layers reduce the number of channels of the feature maps to extract features and reduce computation; then the output is stacked in the channel direction with the length-width features and relative position features of the positive sample obtained from the backbone network to form the fused feature map of the mask branch; finally, this map is input into a mask head formed from the dynamic convolution parameters generated by the backbone network to produce the final segmentation mask, with different instances using their corresponding dynamic convolution parameters.
As shown in fig. 2 and in the dashed box at the lower right of fig. 3, the backbone fused feature map is further convolved to obtain the output F_mask, with which both the length-width features of the positive sample and its relative position features are stacked in the channel direction.
The segmentation mask generation branch processes the data as follows: first, convolution layers reduce the number of channels of the three fused feature maps P3, P4, and P5 to further extract features and reduce computation; the resulting output F_mask is then stacked in the channel direction with the length-width features and relative position features of the positive sample obtained from the backbone network to form the fused feature map of the mask branch; finally, this fused map is input into the mask head formed from the dynamic convolution parameters generated by the backbone network, which produces the final segmentation mask, each instance using its corresponding dynamic convolution parameters.
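As a hedged sketch of how one instance's 185 dynamic parameters could be unpacked and applied to the 12-channel fused feature map (the weights-then-biases layout is an assumption for illustration; a real implementation would use convolutions in PyTorch):

```python
import numpy as np

def apply_mask_head(feat, params):
    """Run a 1x1 dynamic-conv mask head (12 -> 8 -> 8 -> 1) on a 12-channel
    fused feature map, using one instance's 185-parameter vector."""
    params = np.asarray(params, dtype=float)

    def take(n):
        nonlocal params
        out, params = params[:n], params[n:]
        return out

    x = feat  # (12, H, W)
    for c_in, c_out in [(12, 8), (8, 8), (8, 1)]:
        w = take(c_in * c_out).reshape(c_out, c_in)  # assumed layout: weights first,
        b = take(c_out).reshape(c_out, 1, 1)         # then biases, for each layer
        x = np.tensordot(w, x, axes=([1], [0])) + b  # 1x1 conv = per-pixel matmul
        if c_out != 1:
            x = np.maximum(x, 0)                     # ReLU between layers
    return 1 / (1 + np.exp(-x))                      # sigmoid -> mask probabilities
```

Each instance supplies its own 185-vector, so the same fused feature map yields a different mask per instance.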
The relative position feature can be regarded as an implicit position code: the coordinate of the center point of the bounding box generated by the backbone network is defined as (0, 0), and the other position coordinates are redefined relative to it, yielding the relative position feature.
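Such an implicit position code can be sketched as two coordinate channels (a simple illustration; the names are not from the invention):

```python
import numpy as np

def relative_position_maps(h, w, center_y, center_x):
    """Two-channel position code: each pixel's (dy, dx) offset from the box
    center, whose coordinate is defined as (0, 0)."""
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return np.stack([ys - center_y, xs - center_x]).astype(np.float32)
```

The two maps are what gets stacked onto F_mask as the x and y channels.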
In this embodiment, the loss function of the CondInst algorithm model comprises a classification loss, a bounding box regression loss, a centerness loss, and a segmentation mask loss, that is:
L_overall = L_cls + L_reg + L_ctrness + L_mask
wherein L_cls denotes the classification loss, L_reg the bounding box regression loss, L_ctrness the centerness loss, and L_mask the segmentation mask loss.
In this embodiment, in the CondInst algorithm model, the bounding box regression loss in the original loss function is changed from the IOU loss to the Focal-EIOU loss:

L_Focal-EIOU = IOU^γ · L_EIOU
L_EIOU = 1 − IOU + ρ²(b, b_gt)/c² + ρ²(w, w_gt)/C_w² + ρ²(h, h_gt)/C_h²

wherein IOU denotes the intersection-over-union, i.e. the ratio of the intersection to the union of the predicted and target bounding boxes; γ is a parameter controlling the degree of outlier suppression; b and b_gt denote the center points of the predicted and target bounding boxes; c denotes the diagonal length of the smallest rectangle enclosing both boxes; w and w_gt denote the widths of the predicted and target bounding boxes, and C_w the width of the enclosing rectangle; h and h_gt denote the heights of the predicted and target bounding boxes, and C_h the height of the enclosing rectangle; ρ(·, ·) denotes the Euclidean distance.
In this embodiment, only L_reg is redefined; the other three loss functions remain unchanged.
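A direct reading of the Focal-EIOU definition above can be sketched in plain Python for a single box pair (boxes as (x1, y1, x2, y2); γ = 0.5 is an assumed default, not taken from the invention). Note that the IOU^γ factor downweights fully non-overlapping boxes, illustrating the outlier suppression the text describes:

```python
def focal_eiou_loss(pred, target, gamma=0.5):
    """L_Focal-EIOU = IOU**gamma * L_EIOU for axis-aligned boxes (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    # IOU: intersection over union of the two boxes
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union
    # smallest rectangle enclosing both boxes: width Cw, height Ch, diagonal c
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2
    # squared center-point distance rho^2(b, b_gt)
    rho2 = ((px1 + px2 - tx1 - tx2) / 2) ** 2 + ((py1 + py2 - ty1 - ty2) / 2) ** 2
    # width and height difference terms
    w_term = ((px2 - px1) - (tx2 - tx1)) ** 2 / cw ** 2
    h_term = ((py2 - py1) - (ty2 - ty1)) ** 2 / ch ** 2
    l_eiou = 1 - iou + rho2 / c2 + w_term + h_term
    return iou ** gamma * l_eiou
```

An identical prediction yields zero loss, and a nearly-correct box yields a smaller loss than a badly offset one.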
As shown in fig. 2, step 3) is implemented as follows: after an input image passes through the convolution, pooling, downsampling, and upsampling operations of the backbone network, a fused feature map pyramid comprising the five fused feature maps P3, P4, P5, P6, and P7 is obtained. After the features of each fused map are further extracted by a shared convolution-pooling operation, they are fed into the classification branch, bounding box regression branch, centerness regression branch, and dynamic convolution parameter generation branch, yielding, for each point of each fused feature map, the category of the original image region it maps to, the bounding box position of the positive sample, the centerness of the positive sample, and the dynamic convolution parameters of each point.
In the segmentation mask generation branch, the three fused feature maps P3, P4, and P5 are further convolved to obtain the feature map F_mask; F_mask is stacked in the channel direction with the length-width features and relative position features of the positive sample obtained from the backbone network to form the fused feature map of the mask branch, which is finally input into the mask head formed from the dynamic convolution parameters generated by the backbone network to produce the final segmentation mask.
To segment an input picture, the picture is fed into the trained and tested model, which generates the corresponding segmentation mask.
In this embodiment, training and testing in step 4) are performed under the deep learning framework PyTorch using the Python language, and include:
before training, selecting and setting the training parameters, including the batch size, the learning rate, the number of epochs and the optimizer;
and training after the training parameters are set, obtaining a sandstone aggregate image segmentation model after the training is finished, and inputting the test set into the model for testing.
It should be noted that model training is considered complete when, within the set number of epochs, the training-set loss computed by the segmentation model according to the loss function has dropped to a sufficiently small value and the mean intersection-over-union of the segmentation results on the training set reaches a preset threshold.
After the model is trained, the test set is input into the trained model; the test-set loss computed according to the loss function and the mean intersection-over-union of the segmentation results on the test set are compared with the training-set results, and if the gap does not exceed a specified value, the model passes the test.
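The evaluation criterion above can be sketched as follows. The gap threshold of 0.05 is an illustrative assumption; the patent only says "a specified value".

```python
import numpy as np

def mask_iou(pred, gt):
    """Intersection-over-union between two boolean masks of equal shape."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

def passes_test(train_miou, test_miou, max_gap=0.05):
    """The model is considered qualified if the test-set mean IoU trails
    the training-set mean IoU by no more than max_gap (assumed value)."""
    return (train_miou - test_miou) <= max_gap
```

A large gap between the two mean-IoU values would indicate overfitting to the training set.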
In this embodiment, in step 5), the trained and tested sand aggregate image segmentation model is deployed in the background of a coarse aggregate parameter analysis platform; the sand aggregate image to be processed, collected by the camera, is input into the model to obtain the segmentation result, which is then displayed on the platform interface.
The segmentation result is a segmented image obtained by multiplying the segmentation mask by the pixel value of a chosen color and then proportionally adding the result to the pixel values at the corresponding positions of the original image.
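This blending step can be sketched as follows; the green color and the 0.5 blending ratio are illustrative assumptions, not values from the patent.

```python
import numpy as np

def overlay(image, mask, color=(0, 255, 0), alpha=0.5):
    """Blend a binary mask (H, W) onto an RGB image (H, W, 3): each masked
    pixel becomes alpha*color + (1-alpha)*original pixel value."""
    out = image.astype(np.float64).copy()
    color = np.asarray(color, dtype=np.float64)
    sel = mask.astype(bool)
    out[sel] = alpha * color + (1 - alpha) * out[sel]
    return out.astype(np.uint8)
```

Unmasked pixels are left unchanged, so the aggregate outlines remain visible against the original photograph.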
It should be understood that although the steps are described above in a certain order, the steps are not necessarily performed in the order described. The steps are not limited to being performed in the exact order illustrated and, unless explicitly stated herein, may be performed in other orders. Moreover, some steps of this embodiment may include multiple steps or multiple stages, which are not necessarily performed at the same time, but may be performed at different times; the order of performing these steps or stages is not necessarily sequential, and they may be performed in rotation or alternately with other steps or with at least a part of the steps or stages in other steps.
Therefore, in the segmentation mask generation branch of the CondInst model, the mask is predicted using the multi-layer fused features of the feature pyramid, and position features and length-width features are fused into each layer of fused features, so that the model can judge more quickly whether each pixel around an instance center belongs to the current instance, accelerating the mask prediction process.
The bounding box regression loss function in the original model loss function is changed from the IOU loss function to the Focal-EIOU loss function. This loss function considers not only the overlapping area of the prediction bounding box and the target bounding box, but also the center-point distance loss and the width and height losses between them, which further alleviates the vanishing-gradient problem that arises when the two boxes do not overlap and accelerates model convergence. Meanwhile, by means of the parameter controlling the degree of outlier suppression, high-quality prediction bounding boxes obtain larger losses, improving the regression accuracy of the bounding box.
In summary, by fusing the length-width features and improving the bounding box regression loss function, the method can segment sand aggregate images faster and more accurately than the original model.
In one embodiment, referring to fig. 4, there is further provided a computer device 1000, including at least one processor 1002 and a memory 1001 communicatively connected to the at least one processor 1002, wherein the memory 1001 stores instructions executable by the at least one processor 1002, and the instructions are executed by the at least one processor 1002 to cause the at least one processor 1002 to execute the improved CondInst-based sand aggregate image segmentation processing method, the processor 1002 implementing the steps of the method embodiments when executing the instructions:
1) And collecting and marking the sand aggregate image, and establishing a sand aggregate image data set.
2) And preprocessing the sandstone aggregate image data set, and dividing a training set and a testing set.
3) And constructing an improved CondInst example segmentation model.
4) Setting network training parameters, and training and testing by using an improved CondInst algorithm model.
5) And inputting the sand aggregate image newly shot by the camera into the trained improved CondInst algorithm model to be processed to obtain a sand aggregate image segmentation result.
In step 1), when collecting the sand aggregate images, the sand aggregate is spread randomly on a horizontal platform under different illumination and humidity conditions, and an industrial camera fixed above the horizontal platform takes the photographs; after shooting, the aggregate edges are annotated with Labelme software, with smooth transitions annotated at the occluded parts, and the sand aggregate image dataset is established.
In step 2), the sand aggregate image dataset is preprocessed using histogram equalization and Z-score normalization methods, and the data is randomly divided into a training set and a test set in a 9:1 ratio.
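The preprocessing and split can be sketched as below. This is a minimal NumPy sketch: the patent does not specify the equalization implementation or the random seed, so those details are assumptions.

```python
import numpy as np

def hist_equalize(gray):
    """Histogram equalization for a uint8 grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    # Normalize the cumulative distribution to [0, 1] and build a lookup table.
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min() + 1e-9)
    return (cdf * 255).astype(np.uint8)[gray]

def z_score(x):
    """Z-score normalization: zero mean, unit variance."""
    x = x.astype(np.float64)
    return (x - x.mean()) / (x.std() + 1e-9)

def split_9_1(items, seed=0):
    """Randomly divide a list of samples into training and test sets, 9:1."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(items))
    cut = int(0.9 * len(items))
    return [items[i] for i in idx[:cut]], [items[i] for i in idx[cut:]]
```

Equalization spreads the intensity histogram of dusty or dim aggregate photographs, while Z-score normalization standardizes the pixel statistics before the images enter the network.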
In an exemplary embodiment of the present invention, there is also provided a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, the various aspects of the present invention may also be implemented in the form of a program product including program code for causing a terminal device to execute the method for processing segmentation of a sand aggregate image based on modified CondInst according to various exemplary embodiments of the present invention described in the section "exemplary method" above in this specification, when the program product is run on the terminal device, the method for processing segmentation of a sand aggregate image based on modified CondInst including:
1) And collecting and marking the sand aggregate image, and establishing a sand aggregate image data set.
2) And preprocessing the sandstone aggregate image data set, and dividing a training set and a testing set.
3) And building an improved CondInst example segmentation model.
4) And setting network training parameters, and training and testing by using an improved CondInst algorithm model.
5) And inputting the sand aggregate image newly shot by the camera into the trained improved CondInst algorithm model to be processed to obtain a sand aggregate image segmentation result.
In step 1), when collecting the sand aggregate images, the sand aggregate is spread randomly on a horizontal platform under different illumination and humidity conditions, and an industrial camera fixed above the horizontal platform takes the photographs; after shooting, the aggregate edges are annotated with Labelme software, with smooth transitions annotated at the occluded parts, and the sand aggregate image dataset is established.
In step 2), the sand aggregate image dataset is preprocessed using histogram equalization and Z-score normalization methods, and the data is randomly divided into a training set and a test set in a 9:1 ratio.
In an exemplary embodiment of the present invention, a program product for implementing the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this respect, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware instructions of a computer program, which can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory.
The above description is intended to be illustrative of the preferred embodiment of the present invention and should not be taken as limiting the invention, but rather, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.
Claims (10)
1. A sand aggregate image segmentation processing method based on improved CondInst is characterized by comprising the following steps:
1) Collecting and labeling a sand aggregate image, and establishing a sand aggregate image data set;
2) Preprocessing the sandstone aggregate image data set, and dividing a training set and a testing set;
3) Constructing an improved CondInst instance segmentation model;
4) Setting network training parameters, and training and testing by using an improved CondInst algorithm model;
5) And inputting the sandstone aggregate image newly shot by the camera into the trained improved CondInst algorithm model to obtain a sandstone aggregate image segmentation result.
2. The method for processing the sand-aggregate image segmentation based on the improved CondInst as claimed in claim 1, wherein the step 1) of collecting the sand-aggregate image comprises the following steps: paving the sandstone aggregate on a horizontal platform at will under different illumination and humidity conditions, and fixing an industrial camera above the horizontal platform for shooting; and marking the edge of the aggregate by using Labelme software after shooting is finished, marking smooth transition at the sheltered part, and establishing a sandstone aggregate image data set.
3. The CondInst-based sand aggregate image segmentation processing method of claim 2, wherein in step 2), the sand aggregate image dataset is preprocessed using histogram equalization and Z-score normalization methods and is randomly divided into a training set and a test set in a 9:1 ratio.
4. The method for processing sand aggregate image segmentation based on improved CondInst as claimed in claim 1, wherein in step 3), the improved CondInst algorithm model comprises a backbone network, a segmentation mask generation branch and a loss function; the feature extraction part of the backbone network is based on that of the original CondInst algorithm model, and the feature extraction of the backbone network comprises:
performing feature extraction with a ResNet network, wherein the ResNet network comprises five stages, the output feature map size is halved between adjacent stages, a feature pyramid of successively smaller output feature maps is obtained with the ResNet network, and the output feature maps of the last three stages are taken for further feature fusion to form a new feature pyramid;
inputting each layer of the new characteristic pyramid into a shared head network, and inputting each layer into different output layers after passing through a shared convolution layer, wherein each output layer comprises a classification prediction layer, a frame prediction layer, a centrality prediction layer and a dynamic convolution parameter generation layer; and the classification prediction layer, the frame prediction layer and the centrality prediction layer are unchanged, and the dynamic convolution parameter generation layer is modified.
5. The CondInst-based sand aggregate image segmentation processing method as claimed in claim 4, wherein in the segmentation mask generation branch, the number of channels of the parameter matrix generated by the dynamic convolution parameter generation layer is changed from 169 to 185, and the improved CondInst algorithm model not only uses the P3 layer of the feature pyramid to generate the segmentation mask, but also predicts the segmentation mask through the P4 layer and the P5 layer of the feature pyramid.
6. The CondInst-based sand aggregate image segmentation processing method of claim 5, wherein in the segmentation mask generation branch, the improved CondInst algorithm model operates on the three fused feature maps P3, P4 and P5, and the operation steps include:
using the convolution layer to reduce the number of channels of the feature map so as to extract features and reduce calculated amount;
concatenating the output, in the channel direction, with the length-width features and the relative position features of the positive samples obtained by the backbone network, to form the fused feature map of the mask branch;
inputting the fused feature map into a mask head formed from the dynamic convolution parameters generated by the backbone network to generate the final segmentation mask, wherein different instances use their corresponding dynamic convolution parameters.
7. The CondInst-based sand aggregate image segmentation processing method of claim 6, wherein the loss function of the CondInst algorithm model includes a classification loss function, a bounding box regression loss function, a centrality loss function and a segmentation mask loss function, namely:
L_overall = L_cls + L_reg + L_ctrness + L_mask
wherein L_cls represents the classification loss function, L_reg represents the bounding box regression loss function, L_ctrness represents the centrality loss function, and L_mask represents the segmentation mask loss function.
8. The CondInst-based sand aggregate image segmentation processing method of claim 7, wherein in the CondInst algorithm model, the bounding box regression loss function in the original loss function is changed from the IOU loss function to the Focal-EIOU loss function, and the formula is as follows:
L_EIOU = 1 − IOU + ρ²(b, b_gt)/c² + ρ²(w, w_gt)/C_w² + ρ²(h, h_gt)/C_h²
L_Focal-EIOU = IOU^γ · L_EIOU
wherein IOU represents the intersection-over-union ratio, namely the ratio of the intersection to the union of the prediction bounding box and the target bounding box; γ represents a parameter controlling the degree of outlier suppression; b represents the center point of the prediction bounding box; b_gt represents the center point of the target bounding box; c represents the diagonal length of the circumscribed rectangle of the prediction bounding box and the target bounding box; w represents the width of the prediction bounding box; w_gt represents the width of the target bounding box; C_w represents the width of the circumscribed rectangle of the prediction bounding box and the target bounding box; h represents the height of the prediction bounding box; h_gt represents the height of the target bounding box; C_h represents the height of the circumscribed rectangle of the prediction bounding box and the target bounding box; ρ(x, y) represents the Euclidean distance between x and y.
9. The CondInst-based sand aggregate image segmentation processing method of claim 1, wherein in step 4), training and testing are performed under the deep learning framework PyTorch using the Python language, comprising:
before training, selecting and setting the training parameters, including the batch size, the learning rate, the number of epochs and the optimizer;
and training after the training parameters are set, obtaining a sandstone aggregate image segmentation model after the training is finished, and inputting a test set into the model for testing.
10. The CondInst-based sand aggregate image segmentation processing method as claimed in claim 1, wherein in step 5), the trained and tested sand aggregate image segmentation model is deployed in the background of a coarse aggregate parameter analysis platform, the sand aggregate image to be processed collected by a camera is input into the model to obtain the segmentation result of the sand aggregate image, and the result is displayed on the platform interface.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211461364.0A CN115861610A (en) | 2022-11-17 | 2022-11-17 | Improved CondInst-based sandstone aggregate image segmentation processing method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115861610A true CN115861610A (en) | 2023-03-28 |
Family
ID=85664627
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211461364.0A Pending CN115861610A (en) | 2022-11-17 | 2022-11-17 | Improved CondInst-based sandstone aggregate image segmentation processing method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115861610A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116678885A (en) * | 2023-08-03 | 2023-09-01 | 福建南方路面机械股份有限公司 | Deep learning-based detection control method and device for mud content of water-washed coarse aggregate |
CN116678885B (en) * | 2023-08-03 | 2023-12-19 | 福建南方路面机械股份有限公司 | Deep learning-based detection control method and device for mud content of water-washed coarse aggregate |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Li et al. | Automatic crack recognition for concrete bridges using a fully convolutional neural network and naive Bayes data fusion based on a visual detection system | |
WO2017215622A1 (en) | Object segmentation method and apparatus and computing device | |
CN112183667B (en) | Insulator fault detection method in cooperation with deep learning | |
CN112734641A (en) | Training method and device of target detection model, computer equipment and medium | |
CN112464911A (en) | Improved YOLOv 3-tiny-based traffic sign detection and identification method | |
CN114529707B (en) | Three-dimensional model segmentation method and device, computing equipment and readable storage medium | |
CN109671071A (en) | A kind of underground piping defect location and grade determination method based on deep learning | |
CN110751195B (en) | Fine-grained image classification method based on improved YOLOv3 | |
CN110751644B (en) | Road surface crack detection method | |
CN111241994A (en) | Method for extracting remote sensing image rural highway desertification road section for deep learning | |
CN111768415A (en) | Image instance segmentation method without quantization pooling | |
Yang et al. | Semantic segmentation of bridge point clouds with a synthetic data augmentation strategy and graph-structured deep metric learning | |
CN112949738A (en) | Multi-class unbalanced hyperspectral image classification method based on EECNN algorithm | |
CN113033516A (en) | Object identification statistical method and device, electronic equipment and storage medium | |
CN117495735B (en) | Automatic building elevation texture repairing method and system based on structure guidance | |
CN114463637A (en) | Winter wheat remote sensing identification analysis method and system based on deep learning | |
CN113343767A (en) | Logistics illegal operation identification method, device, equipment and storage medium | |
CN115861610A (en) | Improved CondInst-based sandstone aggregate image segmentation processing method | |
CN109684910A (en) | A kind of method and system of network detection transmission line of electricity ground surface environment variation | |
Yang et al. | Superpixel image segmentation-based particle size distribution analysis of fragmented rock | |
CN116778137A (en) | Character wheel type water meter reading identification method and device based on deep learning | |
CN113496148A (en) | Multi-source data fusion method and system | |
CN112115817B (en) | Remote sensing image road track correctness checking method and device based on deep learning | |
Shit et al. | An encoder‐decoder based CNN architecture using end to end dehaze and detection network for proper image visualization and detection | |
CN113887455A (en) | Face mask detection system and method based on improved FCOS |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||