CN114782459B - Spliced image segmentation method, device and equipment based on semantic segmentation - Google Patents

Spliced image segmentation method, device and equipment based on semantic segmentation

Info

Publication number
CN114782459B
CN114782459B (application number CN202210701199.5A)
Authority
CN
China
Prior art keywords
image
segmentation
spliced
information
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210701199.5A
Other languages
Chinese (zh)
Other versions
CN114782459A (en)
Inventor
张翡
高依铨
邓富城
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Jijian Technology Co ltd
Original Assignee
Shandong Jivisual Angle Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Jivisual Angle Technology Co ltd filed Critical Shandong Jivisual Angle Technology Co ltd
Priority to CN202210701199.5A priority Critical patent/CN114782459B/en
Publication of CN114782459A publication Critical patent/CN114782459A/en
Application granted granted Critical
Publication of CN114782459B publication Critical patent/CN114782459B/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/10 - Segmentation; Edge detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformations in the plane of the image
    • G06T3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038 - Image mosaicing, e.g. composing plane images from plane sub-images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a spliced image segmentation method, apparatus and device based on semantic segmentation, relates to the technical field of image data processing, and aims to improve the robustness and reliability of spliced image segmentation. The segmentation method comprises: inputting an obtained spliced image, which contains single images, into a pre-trained semantic segmentation model; determining segmentation labels of the spliced image through the semantic segmentation model; extracting contour information of the segmentation labels; outputting contour points of the segmentation labels according to the contour information, the contour of each segmentation label being composed of a plurality of contour points; calculating the minimum circumscribed rectangular frame of the contour points to obtain single-image target frames; determining position information and width and height information of the single-image target frames; inputting the position and width and height information into a pre-trained prediction model; determining the splicing mode of the spliced image through the prediction model; and segmenting the spliced image according to the splicing mode.

Description

Spliced image segmentation method, device and equipment based on semantic segmentation
Technical Field
The present application relates to the field of image data processing technologies, and in particular, to a method, an apparatus, and a device for segmenting a stitched image based on semantic segmentation.
Background
In the field of video surveillance in the traffic industry, a plurality of images need to be captured to judge whether a vehicle has violated regulations, and the captured images can be spliced into one large picture to facilitate manual review and judgment. However, with the continuous development of artificial intelligence, AI technology is gradually replacing manual review, and in a traffic violation auditing algorithm the first step is to segment the spliced composite picture into single pictures.
In the prior art, segmentation of the stitched image generally adopts a traditional image segmentation method, which tends to segment the stitched image based on changes in pixel values. For example, the edge information of each single image in the stitched image is determined by detecting pixels with severe light-dark changes, that is, pixels with large gradient changes, and the stitched image is then segmented according to this edge information.
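For illustration, a minimal OpenCV sketch of this gradient-based approach; the file path and the seam-scoring heuristic are assumptions, not part of the described prior art:

```python
import cv2
import numpy as np

# Gradient-based edge detection: the traditional route for locating sub-image seams.
stitched = cv2.imread("stitched.jpg", cv2.IMREAD_GRAYSCALE)  # illustrative path

# Sobel gradients; large magnitudes mark pixels with severe light-dark change
gx = cv2.Sobel(stitched, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(stitched, cv2.CV_64F, 0, 1, ksize=3)
magnitude = np.sqrt(gx ** 2 + gy ** 2)

# Rows/columns whose total gradient is unusually high are seam candidates;
# uneven illumination or noise easily corrupts this heuristic, hence the low robustness.
row_score = magnitude.sum(axis=1)
col_score = magnitude.sum(axis=0)
seam_rows = np.where(row_score > row_score.mean() + 3 * row_score.std())[0]
seam_cols = np.where(col_score > col_score.mean() + 3 * col_score.std())[0]
```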
When the image capture device shoots an image, environmental factors around the road surface may cause uneven illumination or noise in the captured image, which degrades the quality of the stitched image. Consequently, when stitched images are segmented with a traditional method, the edge information may be detected inaccurately, the final segmentation result may be wrong, and robustness is low.
Disclosure of Invention
In order to solve the technical problem, the application provides a method, a device and equipment for segmenting a spliced image based on semantic segmentation, which are used for improving the robustness and reliability of segmentation of the spliced image.
The first aspect of the application provides a spliced image segmentation method based on semantic segmentation, which comprises the following steps:
inputting an obtained spliced image into a pre-trained semantic segmentation model, wherein the spliced image comprises a single image;
determining segmentation labels of the spliced image through the semantic segmentation model;
extracting contour information of the segmentation labels;
outputting contour points of the segmentation labels according to the contour information, wherein the contours of the segmentation labels are composed of a plurality of contour points;
calculating a minimum circumscribed rectangle frame of the contour points of the segmentation labels, and obtaining a single-image target frame, wherein the single-image target frame is a frame of an area occupied by each single image in the spliced image;
determining position information and width and height information of the single-image target frame;
inputting the position information and the width and height information into a pre-trained prediction model;
determining a splicing mode of the spliced images through the prediction model;
and segmenting the spliced image according to the splicing mode.
Optionally, the semantic segmentation model is obtained by:
building an initial semantic segmentation model;
acquiring a first sample splicing map, wherein the first sample splicing map contains splicing image label information;
inputting the first sample mosaic into the initial semantic segmentation model;
extracting features in the first sample splicing image to obtain a first image feature image;
performing feature fusion on the first image feature map, and outputting a fused second image feature map;
processing the second image feature map to obtain a sample segmentation label;
performing first loss value calculation on the sample segmentation label to generate first loss value variation data, wherein the first loss value variation data is a first loss value data collection counted during each training of the initial semantic segmentation model;
and when the first loss value change data reach a preset condition, obtaining the semantic segmentation model.
Optionally, the initial semantic segmentation model includes a lightweight network, a feature pyramid network, a segmentation head, and a classification algorithm;
the lightweight network is used as an encoder to extract image features of the spliced image;
the characteristic pyramid network is used for extracting and fusing characteristics with different spatial resolutions in the image characteristics extracted by the lightweight network so as to extract more image characteristic information;
the segmentation head is used for determining final features from the plurality of image feature information extracted by the feature pyramid network, and the classification algorithm is used for determining the classification category of the spliced image; the segmentation head comprises a convolution layer, an upsampling layer and an activation function layer sigmoid, and the classification algorithm comprises a pooling layer, a fully connected layer and an activation function layer sigmoid.
Optionally, the performing feature fusion on the first image feature map, and outputting a fused second image feature map includes:
the feature pyramid network is used as a decoder and is used for fusing the first image feature map output by the lightweight network to output a second image feature map;
the processing the second image feature map to obtain a sample segmentation label includes:
and inputting the second image feature map into the convolutional layer, the upsampling layer and the activation function layer sigmoid to obtain the sample segmentation label, wherein the sample segmentation label is an area occupied by a single map in the second image feature map.
Optionally, the prediction model is obtained by:
building an initial prediction model;
acquiring a second sample mosaic, wherein the second sample mosaic comprises position information and width and height information of each single picture and a mosaic mode of the second sample mosaic;
inputting the second sample mosaic into the initial prediction model;
performing second loss value calculation on the second sample splicing graph according to a preset loss function to generate second loss value change data, wherein the second loss value change data is a loss value data collection counted during each time of training the initial prediction model;
and when the second loss value change data reaches convergence, obtaining the prediction model.
Optionally, when the prediction model is trained, the single-image position information is used as training data, and the stitching pattern is used as the label for training.
Optionally, the extracting the contour information of the segmentation label includes:
performing binarization processing on the segmentation labels to obtain binarized segmentation labels;
and performing contour extraction on the binarized segmentation labels by the hollowed-out interior point method to obtain the contour information.
The second aspect of the present application provides a mosaic image segmentation apparatus based on semantic segmentation, including:
the first input unit is used for inputting the obtained spliced image into a pre-trained semantic segmentation model, wherein the spliced image comprises a single image;
the first determining unit is used for determining the segmentation labels of the spliced image through the semantic segmentation model;
an extraction unit, configured to extract the contour information of the segmentation labels;
the output unit is used for outputting the contour points of the segmentation labels according to the contour information, and the contours of the segmentation labels are composed of a plurality of contour points;
the first calculation unit is used for calculating a minimum circumscribed rectangular frame of the contour points of the segmentation labels and obtaining a single-image target frame, wherein the single-image target frame is a frame of an area occupied by each single image in the spliced image;
the second determining unit is used for determining the position information and the width and height information of the single-image target frame;
the second input unit is used for inputting the position information and the width and height information into a pre-trained prediction model;
a third determining unit, configured to determine a stitching mode of the stitched image through the prediction model;
and the segmentation unit is used for segmenting the spliced image according to the splicing mode.
The third aspect of the present application provides a spliced image segmentation apparatus based on semantic segmentation, including:
the system comprises a central processing unit, a memory, an input/output interface, a wired or wireless network interface and a power supply;
the memory is a transient memory or a persistent memory;
the central processor is configured to communicate with the memory and execute the instructions in the memory to perform any of the aspects of the first aspect and alternatives thereof.
A fourth aspect of the present application provides a computer-readable storage medium comprising instructions which, when executed on a computer, cause the computer to carry out the method of the first aspect and any one of the alternatives of the first aspect.
According to the technical scheme, the embodiment of the application has the following advantages:
according to the method, the segmentation labels of the input spliced images are determined through a semantic segmentation model, the outline information of the segmentation labels is extracted after the segmentation labels are determined, the minimum circumscribed rectangle frame of the segmentation labels is determined, the single-image target frame is obtained, the pre-trained prediction model is input after the position information and the width and height information of the single-image target frame are determined, the splicing mode of the spliced images can be determined through the prediction model, and then the spliced images are segmented according to the splicing mode. The semantic segmentation is to understand an image from a pixel level, and can classify pixels belonging to the same class into one class, so that the problem of inaccurate segmentation caused by uneven illumination or noise in a spliced image when the spliced image is segmented is avoided, and the robustness and reliability of the segmentation of the spliced image are improved.
Drawings
In order to more clearly illustrate the technical solutions in the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and other drawings can be derived from them by those skilled in the art without creative effort.
FIG. 1 is a schematic diagram of an embodiment of a segmentation method for a stitched image based on semantic segmentation in the present application;
FIGS. 2-1 and 2-2 are schematic diagrams of another embodiment of a stitched image segmentation method based on semantic segmentation in the present application;
FIG. 3 is a schematic diagram of a semantic segmentation network component structure according to the present application;
FIG. 4 is a schematic diagram of a segmentation apparatus for semantic segmentation based stitched images according to the present application;
FIG. 5 is another schematic diagram of a segmentation apparatus for semantic segmentation based stitched images according to the present application;
fig. 6 is a schematic structural diagram of a stitched image segmentation apparatus based on semantic segmentation in the present application.
Detailed Description
The embodiment of the application provides a spliced image segmentation method, a spliced image segmentation device and spliced image segmentation equipment based on semantic segmentation, which are used for improving the robustness and reliability of spliced image segmentation.
In a traffic violation auditing algorithm, whether a vehicle exhibits a specific illegal behavior must be judged by the algorithm, but what is passed into the algorithm is usually a composite picture formed by splicing single pictures, so the first step of the algorithm is to divide the composite picture into single pictures. When segmenting a stitched image, conventional segmentation methods are generally used, for example detecting the edge information of each single image with traditional image processing techniques: the edge information is determined from changes in pixel values, and the stitched image is segmented accordingly. However, the captured images may suffer from uneven illumination or noise, so the detected position information of the single images can be offset or inaccurate, and the segmentation result of the stitched image is then wrong. In this application, the stitched image is segmented through a trained semantic segmentation model, which segments the stitched image quickly and accurately and has high robustness in traffic violation scenarios.
The following briefly describes a segmentation method of a stitched image based on semantic segmentation in the present application:
Referring to FIG. 1, FIG. 1 shows an embodiment of the stitched image segmentation method based on semantic segmentation in the present application. The method may be implemented on a device such as a mobile phone or a tablet; for convenience of description, it is described below as applied to a server, and includes:
101. the server inputs the obtained spliced image into a pre-trained semantic segmentation model, wherein the spliced image comprises a single image
In this embodiment, the server obtains a stitched image formed by stitching a plurality of captured single images; the stitched image may use any of several stitching modes, for example up 1 down 2, left 1 right 1, up 2 down 2, and so on, which are not limited here. After the stitched image is obtained, it is input into a semantic segmentation model that has been trained in advance. Optionally, before the stitched image is input into the pre-trained semantic segmentation model, it may be preprocessed, for example by pixel brightness conversion or gray level change, and the preprocessed stitched image is then input into the model.
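A minimal sketch of this optional preprocessing, assuming OpenCV; the brightness parameters, input size and file path are illustrative:

```python
import cv2

def preprocess(stitched_bgr, alpha=1.2, beta=10, size=(512, 512)):
    """Optional preprocessing before the semantic segmentation model:
    pixel brightness conversion followed by resizing to the model's input scale.
    alpha/beta and size are illustrative values, not prescribed by the method."""
    bright = cv2.convertScaleAbs(stitched_bgr, alpha=alpha, beta=beta)  # brightness conversion
    return cv2.resize(bright, size)

stitched = cv2.imread("stitched.jpg")   # illustrative path
model_input = preprocess(stitched)
```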
102. Server determines segmentation labels of spliced images through semantic segmentation model
In the image field, semantics refers to the content of an image and the understanding of its meaning, while segmentation means separating the different objects in a picture at the pixel level and labeling every pixel. For example, in an image containing a person and a background, semantic segmentation can label all pixels of the person as one class and all pixels of the background as another. In the stitched image, the single images can be labeled as one class and the stitched background as another. The semantic segmentation model yields the single-image regions, that is, the regions of the stitched image that need to be segmented, and each such region is a segmentation label. Since the stitched image is formed by stitching several single images, a stitched image composed of 3 single images, for example, has 3 regions to be segmented, and all 3 regions are segmentation labels of the stitched image.
103. Server extracts contour information of segmentation label
In this embodiment, a contour is a curve formed by a series of connected points; it represents the basic shape of an object and can be used for shape analysis.
104. The server outputs contour points of the segmentation labels according to the contour information, and the contours of the segmentation labels are composed of a plurality of contour points
In this embodiment, a contour is composed of a plurality of contour points and may also be regarded as the point set of those contour points. After the contour of a segmentation label is determined, the server determines each contour point of the label together with its coordinate information.
105. The server calculates the minimum circumscribed rectangle frame of the contour points of the segmentation labels and obtains a single-image target frame, wherein the single-image target frame is the frame of the area occupied by each single image in the spliced image
Because the stitched image is formed by stitching several single images, there are several regions to be segmented, and these regions are not connected, that is, the segmentation labels are disconnected from one another. The contour points of each region are therefore treated as one group, and the points within each group are connected. Since the coordinate information of every contour point has already been determined, the minimum circumscribed rectangular frame of each group can be computed with a minimum bounding rectangle algorithm. The minimum circumscribed rectangle is the maximum extent of a two-dimensional shape expressed in two-dimensional coordinates (such as a set of points, line segments or a polygon), that is, the rectangle whose boundary is determined by the maximum abscissa, minimum abscissa, maximum ordinate and minimum ordinate over the vertices of the given shape. The minimum circumscribed rectangular frame of each group of contour points is computed, and the resulting frames are taken as the single-image target frames, each being the frame of the area occupied by one single image in the stitched image. Optionally, minimum circumscribed rectangles that do not satisfy a preset rule may be filtered out: when determining the segmentation labels, other objects in the background may be misjudged as labels, so filtering by a preset rule is needed. The preset rule may be a preset scale or a preset position, and is not limited here.
106. The server determines the position information and the width and height information of the single-image target frame
In this embodiment, the server determines a central point of the minimum circumscribed rectangle frame, determines position information of the single-image target frame according to the central point, and determines width and height information of the single-image target frame according to the width and height of the minimum circumscribed rectangle frame.
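The contour and bounding-box steps (103 to 106) can be sketched with OpenCV as follows; cv2.findContours is used here as a common stand-in (the patent's own contour extraction, the hollowed-out interior point method, is detailed in the second embodiment), and the area-ratio filter is an assumed instance of the preset rule:

```python
import cv2
import numpy as np

def single_image_boxes(label_mask: np.ndarray, min_area_ratio: float = 0.01):
    """label_mask: 0/255 uint8 segmentation-label mask from the model.
    Returns (cx, cy, w, h) per single-image target frame."""
    contours, _ = cv2.findContours(label_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    img_h, img_w = label_mask.shape[:2]
    boxes = []
    for contour in contours:                    # each disconnected region is one group of contour points
        x, y, w, h = cv2.boundingRect(contour)  # rectangle bounded by the min/max abscissa and ordinate
        if w * h < min_area_ratio * img_w * img_h:
            continue                            # filter boxes breaking the preset rule (ratio is assumed)
        boxes.append((x + w / 2, y + h / 2, w, h))  # centre point plus width and height information
    return boxes
```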
107. The server inputs the position information and the width and height information into a pre-trained prediction model
In this embodiment, the server inputs the position information and the width and height information of the single image target frame into a pre-trained prediction model, the prediction model is a stitching mode prediction model, and a stitching mode of a stitched image can be output after the position information and the width and height information of the single image in the stitched image are obtained.
108. The server determines the splicing mode of the spliced images through the prediction model
In this embodiment, the prediction model finally determines a stitching mode of the stitched image, where the stitching mode includes a single image, top 1 and bottom 2, top 2 and bottom 1, left 1 and right 1, top 2 and bottom 2, top 1 and bottom 3, top 3 and bottom 1, and the like.
109. The server segments the spliced image according to the splicing mode
In this embodiment, after the splicing mode is obtained, the spliced image can be segmented according to the segmentation pattern corresponding to that mode, and the obtained single images are input into the traffic violation auditing algorithm to determine whether the vehicle has committed a violation.
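As an illustration, a sketch of the final splitting step, assuming equal-sized single images and illustrative mode names:

```python
def split_by_mode(stitched, mode):
    """Crop the stitched image according to the predicted stitching mode.
    Mode names and the equal-size assumption are illustrative."""
    h, w = stitched.shape[:2]
    if mode == "left1_right1":
        return [stitched[:, : w // 2], stitched[:, w // 2 :]]
    if mode == "up1_down2":
        return [stitched[: h // 2],
                stitched[h // 2 :, : w // 2], stitched[h // 2 :, w // 2 :]]
    if mode == "up2_down2":
        return [stitched[: h // 2, : w // 2], stitched[: h // 2, w // 2 :],
                stitched[h // 2 :, : w // 2], stitched[h // 2 :, w // 2 :]]
    raise ValueError(f"unknown stitching mode: {mode}")
```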
In this embodiment, the server inputs the acquired stitched image into the trained semantic segmentation model to obtain the segmentation labels, determines the minimum circumscribed rectangular frame of each label to obtain the single-image target frames, and, after determining their position and width and height information, inputs this information into the trained prediction model. The prediction model determines the stitching mode of the stitched image, and the stitched image is segmented according to that mode. The stitched image can thus be segmented quickly and accurately, with high robustness in traffic violation scenarios.
In this embodiment, an initial semantic segmentation model and an initial prediction model need to be built first, and the semantic segmentation model and the prediction model can be obtained after training, which will be described in detail below with reference to the accompanying drawings.
Referring to FIG. 2-1, FIG. 2-2 and FIG. 3: FIGS. 2-1 and 2-2 show another embodiment of the stitched image segmentation method based on semantic segmentation in the present application, and FIG. 3 is a schematic structural diagram of the initial semantic segmentation model. For convenience of description, the method is described below as applied to a server, and includes:
201. server building initial semantic segmentation model
202. The server acquires a first sample splicing map, wherein the first sample splicing map contains label information of splicing images
203. The server inputs the first sample splicing picture into an initial semantic segmentation model
204. The server extracts the features in the first sample splicing picture to obtain a first image feature picture
205. The server performs feature fusion on the first image feature map and outputs a fused second image feature map
206. The server processes the second image feature map to obtain a sample segmentation label
207. The server calculates a first loss value of the sample segmentation label to generate first loss value change data, wherein the first loss value change data is a first loss value data collection set counted during each training of the initial semantic segmentation model
208. When the first loss value change data reach a preset condition, the server obtains a semantic segmentation model
In this embodiment, an initial semantic segmentation model needs to be built. As shown in FIG. 3, the initial semantic segmentation model includes a lightweight network (MobileNet_v2) 301, a feature pyramid network (Feature Pyramid Network) 302, a segmentation head (SegmentationHead) 303, and a classification algorithm (ClassificationHead) 304. The lightweight network serves as the encoder that extracts image features of the stitched image; the feature pyramid network extracts and fuses features of different spatial resolutions from the image features extracted by the lightweight network, so as to extract more image feature information; the segmentation head determines the final features from the multiple image feature maps extracted by the feature pyramid network, and the classification algorithm determines the classification category and confidence of the stitched image. The segmentation head comprises a convolution layer, an upsampling layer and an activation function layer sigmoid, and the classification algorithm comprises a pooling layer, a fully connected layer and an activation function layer sigmoid.
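A minimal PyTorch sketch of the two heads as described; channel counts, kernel size and the upsampling factor are illustrative assumptions:

```python
import torch.nn as nn

class SegmentationHead(nn.Sequential):
    """Convolution layer -> upsampling layer -> activation function layer sigmoid."""
    def __init__(self, in_channels: int, out_channels: int = 1, upsampling: int = 4):
        super().__init__(
            nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
            nn.UpsamplingBilinear2d(scale_factor=upsampling),  # back to the input scale
            nn.Sigmoid(),                                      # per-pixel probability of "single image"
        )

class ClassificationHead(nn.Sequential):
    """Pooling layer -> fully connected layer -> activation function layer sigmoid."""
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__(
            nn.AdaptiveAvgPool2d(1),              # pooling layer
            nn.Flatten(),
            nn.Linear(in_channels, num_classes),  # fully connected layer
            nn.Sigmoid(),                         # classification category confidence
        )
```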
When the initial semantic segmentation model is trained, a large number of samples need to be obtained, for example fifty thousand first sample mosaic images. Each first sample mosaic image carries label information; during training, the original image and its label information are input together. The label information is a binary map with only two values, 0 and 1: positions belonging to a single-image region have pixel value 1, and all other positions are background with pixel value 0. The lightweight network 301 serves mainly as the encoder that extracts image features of the first sample mosaic; after an image is input, it outputs first image feature maps of different scales through different convolution structures. These feature maps of different scales are fed into the feature pyramid network 302, which acts as the decoder: it extracts features of the image at each scale, fuses the feature maps of different scales output by the encoder, and outputs the fused second image feature map. The segmentation head 303 then applies convolution, upsampling and the activation function layer sigmoid to the decoder output to obtain a segmentation label at the same scale as the input sample image. Since only one class (the single image) needs to be segmented, one segmentation label is output. Dice Loss is adopted as the loss function when training the initial semantic segmentation model. The Dice coefficient is a set-similarity measure, generally used to compute the similarity of two samples, and it is computed through the loss function formula: the larger the Dice coefficient, the more similar the sets and the smaller the loss, and vice versa. When the Dice coefficient reaches a preset value, training of the semantic segmentation model is complete.
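The Dice Loss described above can be written in plain PyTorch as follows (the smoothing constant is an assumption):

```python
import torch

def dice_loss(pred: torch.Tensor, target: torch.Tensor, smooth: float = 1.0) -> torch.Tensor:
    """Dice Loss = 1 - Dice coefficient.
    pred: sigmoid probabilities; target: 0/1 label map.
    The larger the Dice coefficient (set similarity), the smaller the loss."""
    pred, target = pred.reshape(-1), target.reshape(-1)
    intersection = (pred * target).sum()
    dice = (2.0 * intersection + smooth) / (pred.sum() + target.sum() + smooth)
    return 1.0 - dice
```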
Optionally, during training of the semantic segmentation model, images of multiple scales may be randomly input for training, such as images with resolutions 416x416 and 512x512, where the specific scale is not limited here, so as to improve robustness of the semantic segmentation model for image scale transformation, and make the model less susceptible to image scale transformation.
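A sketch of this multi-scale training trick; the scale set uses the resolutions mentioned above, and the per-batch resize is an assumed implementation detail:

```python
import random
import torch.nn.functional as F

SCALES = [416, 512]   # resolutions named above; the exact set is not limited

def random_rescale(images, masks):
    """Pick a random input scale per batch so the model tolerates scale changes.
    images: (N, C, H, W) float tensor; masks: (N, 1, H, W) label tensor."""
    size = random.choice(SCALES)
    images = F.interpolate(images, size=(size, size), mode="bilinear", align_corners=False)
    masks = F.interpolate(masks, size=(size, size), mode="nearest")  # keep labels binary
    return images, masks
```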
209. The server inputs the obtained spliced image into a pre-trained semantic segmentation model, wherein the spliced image comprises a single image
210. The server determines the segmentation labels of the spliced images through the semantic segmentation model
Steps 209 to 210 in this embodiment are similar to steps 101 to 102 in the embodiment shown in fig. 1, and are not repeated here.
211. The server carries out binarization processing on the segmentation label to obtain a binarization segmentation label
212. The server extracts the contour of the binary segmentation label by a hollowed internal point method to obtain contour information
In this embodiment, the hollowed-out interior point method is generally used to extract the contour of a binary image; a non-binary image must first be binarized, so the segmentation label is binarized to obtain a binarized segmentation label. Image binarization sets the gray value of every pixel to either 0 or 255, giving the whole image an obvious black-and-white appearance. The hollowed-out interior point method traverses every pixel in the image: if the gray value of a pixel is 0, it is assigned 0 regardless of the gray values of its 8 neighboring pixels; if the gray value is 255 and the gray values of all 8 neighboring pixels are also 255, it is assigned 0; in every other case it is assigned 255. After this processing, the contour of the image is obtained. Alternatively, the contour of the segmentation label can be determined by a boundary tracing method: the non-binary image is again binarized first, a starting point is chosen, and the next boundary point is searched for according to a preset rule until the search returns to the starting point, at which time the contour has been found. Region growing, region splitting and merging, and other methods also exist; the specific contour extraction method is not limited here.
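The rule just described translates directly into code; a NumPy sketch (written as an explicit loop for clarity rather than speed):

```python
import numpy as np

def hollow_interior_points(binary: np.ndarray) -> np.ndarray:
    """Contour extraction by the hollowed-out interior point method.
    binary: 0/255 image. A 255 pixel whose 8 neighbours are all 255 is an
    interior point and is erased; every other 255 pixel is kept as contour."""
    padded = np.pad(binary, 1, mode="constant", constant_values=0)
    out = np.zeros_like(binary)
    h, w = binary.shape
    for i in range(h):
        for j in range(w):
            if binary[i, j] == 0:
                continue                           # gray value 0: assigned 0
            window = padded[i : i + 3, j : j + 3]  # the pixel and its 8 neighbours
            if window.min() == 255:
                continue                           # all neighbours 255: interior, assigned 0
            out[i, j] = 255                        # boundary point: assigned 255
    return out
```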
213. The server outputs contour points of the segmentation labels according to the contour information, and the contours of the segmentation labels are composed of a plurality of contour points
214. The server calculates the minimum circumscribed rectangle frame of the contour points of the segmentation labels and obtains a single-image target frame, wherein the single-image target frame is the frame of the area occupied by each single image in the spliced image
215. The server determines the position information and the width and height information of the single-image target frame
Steps 213 to 215 in this embodiment are similar to steps 104 to 106 in the embodiment shown in fig. 1, and are not repeated herein.
216. Server building initial prediction model
217. The server acquires a second sample mosaic which contains the position information and the width and height information of each single picture and the mosaic mode of the second sample mosaic
218. The server inputs the second sample splicing map into the initial prediction model
219. The server performs second loss value calculation on the second sample splicing diagram according to a preset loss function to generate second loss value change data, wherein the second loss value change data is a loss value data collection set counted during each time of training the initial prediction model
220. When the second loss value change data reaches convergence, the server obtains a prediction model
In this embodiment, a large number of samples are required to train the initial prediction model; for example, sixty thousand second sample mosaic images may be obtained. Each second sample mosaic image includes the position information of each single image, that is, the coordinates of the top-left corner of its single-image target frame, together with the width and height information of the target frame; these are the model inputs during training. Each second sample mosaic image also carries its own mosaic pattern. When the prediction model is trained, the position information of the single images is used as the training data and the mosaic pattern as the label, and a regression model is used for fitting:
h_\theta(x) = \theta^{\mathsf{T}} x = \theta_0 + \theta_1 x_1 + \dots + \theta_n x_n

The loss function (least squares) is as follows; the smaller the loss, the closer h(x) is to y, i.e., the closer the fitted value is to the true value:

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta\left(x^{(i)}\right) - y^{(i)} \right)^2
The regression coefficients (the weights θ) are unknown and must be adjusted continuously to minimize the loss value. During training, the weights θ are adjusted by gradient descent until convergence, yielding the prediction model.
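A NumPy sketch of this least-squares fit by gradient descent; the feature layout and the numeric coding of the stitching modes are assumptions:

```python
import numpy as np

def fit_stitch_mode_regressor(X: np.ndarray, y: np.ndarray,
                              lr: float = 0.01, epochs: int = 10000,
                              tol: float = 1e-6) -> np.ndarray:
    """Fit h(x) = X @ theta by minimising the least-squares loss with gradient descent.
    X: (m, n) rows built from each single image's position and width/height
    information, padded to a fixed length; y: (m,) numeric stitching-mode codes."""
    m, n = X.shape
    theta = np.zeros(n)                    # regression weights, adjusted until convergence
    for _ in range(epochs):
        residual = X @ theta - y           # h(x) - y: fitted value minus true value
        grad = X.T @ residual / m          # gradient of the least-squares loss
        theta -= lr * grad
        if np.linalg.norm(grad) < tol:     # gradient vanishing: converged
            break
    return theta
```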
221. The server inputs the position information and the width and height information into a pre-trained prediction model
222. The server determines the splicing mode of the spliced images through the prediction model
223. The server segments the spliced image according to the splicing mode
Steps 221 to 223 in this embodiment are similar to steps 107 to 109 in the embodiment shown in fig. 1, and are not described again here.
In this embodiment, the server trains the initial semantic segmentation model to obtain the semantic segmentation model and inputs the acquired stitched image into it to obtain the segmentation labels. The contour information of the labels is determined by the hollowed-out interior point method, and the minimum circumscribed rectangular frame of each label is determined from the contour information to obtain the single-image target frames. After determining the position and width and height information of the target frames, the server inputs this information into the trained prediction model, determines the stitching mode of the stitched image through the prediction model, and segments the stitched image according to that mode.
The above description is about the segmentation method of the stitched image based on semantic segmentation, and the following description is about the segmentation device of the stitched image based on semantic segmentation:
Referring to FIG. 4, a stitched image segmentation apparatus based on semantic segmentation in the present application includes:
a first input unit 401, configured to input an obtained stitched image into a pre-trained semantic segmentation model, where the stitched image includes a single image;
a first determining unit 402, configured to determine a segmentation label of the stitched image through a semantic segmentation model;
an extracting unit 403, configured to extract the contour information of the segmentation labels;
an output unit 404, configured to output contour points of a segmentation label according to the contour information, where the contour of the segmentation label is composed of a plurality of contour points;
the first calculation unit 405 is configured to calculate a minimum circumscribed rectangular frame of the contour points of the segmentation labels, and obtain a single-map target frame, where the single-map target frame is a frame of an area occupied by each single map in the stitched image;
a second determining unit 406, configured to determine position information and width and height information of the single-image target frame;
a second input unit 407, configured to input the position information and the width and height information into a pre-trained prediction model;
a third determining unit 408, configured to determine a stitching mode of the stitched image through the prediction model;
and a segmentation unit 409 for segmenting the stitched image according to the stitching mode.
In this embodiment, the first input unit 401 inputs the acquired stitched image into the trained semantic segmentation model, the first determining unit 402 determines the segmentation labels, and the first calculating unit 405 calculates the minimum circumscribed rectangular frame of the labels to obtain the single-image target frames. The second determining unit 406 determines the position and width and height information of the target frames, the second input unit 407 inputs this information into the trained prediction model, the third determining unit 408 determines the stitching mode of the stitched image through the prediction model, and the segmentation unit 409 segments the stitched image according to that mode. The stitched image can thus be segmented quickly and accurately, with high robustness in traffic violation scenarios.
Referring to FIG. 5, another stitched image segmentation apparatus based on semantic segmentation in the present application includes:
the first building unit 501 is used for building an initial semantic segmentation model;
a first obtaining unit 502, configured to obtain a first sample mosaic, where the first sample mosaic includes mosaic image label information;
a third input unit 503, configured to input the first sample mosaic into the initial semantic segmentation model;
a second extracting unit 504, configured to extract features in the first sample mosaic to obtain a first image feature map;
a fusion unit 505, configured to perform feature fusion on the first image feature map, and output a fused second image feature map;
the first processing unit 506 is configured to process the second image feature map to obtain a sample segmentation label;
a second calculating unit 507, configured to perform a first loss value calculation on the sample segmentation label to generate first loss value change data, where the first loss value change data is a first loss value data set counted during each training of the initial semantic segmentation model;
a first obtaining unit 508, configured to obtain a semantic segmentation model when the first loss value change data reaches a preset condition;
a first input unit 509, configured to input the obtained stitched image into a pre-trained semantic segmentation model, where the stitched image includes a single image;
a first determining unit 510, configured to determine a segmentation label of the stitched image through a semantic segmentation model;
the extraction unit 511 includes:
a second processing unit 5111, configured to perform binarization processing on the segmentation label to obtain a binarized segmentation label;
an extraction module 5112, configured to perform contour extraction on the binarized segmented label by using a hollowed-out interior point method to obtain contour information;
an output unit 512, configured to output contour points of the segmentation labels according to the contour information, where the contours of the segmentation labels are composed of a plurality of contour points;
a first calculating unit 513, configured to calculate a minimum circumscribed rectangular frame of the contour points of the segmentation labels, and obtain a single-map target frame, where the single-map target frame is a frame of an area occupied by each single map in the stitched image;
a second determining unit 514, configured to determine position information and width and height information of the single-map target frame;
a second building unit 515, configured to build an initial prediction model;
a second obtaining unit 516, configured to obtain a second sample mosaic, where the second sample mosaic includes position information and width and height information of each single map, and a mosaic pattern of the second sample mosaic;
a fourth input unit 517 for inputting the second sample mosaic into the initial prediction model;
a third calculating unit 518, configured to perform second loss value calculation on the second sample mosaic according to a preset loss function to generate second loss value change data, where the second loss value change data is a loss value data set counted during each training of the initial prediction model;
a second obtaining unit 519, configured to obtain a prediction model when the second loss value change data reaches convergence;
a second input unit 520, configured to input the position information and the width and height information into a pre-trained prediction model;
a third determining unit 521, configured to determine a stitching mode of the stitched image through the prediction model;
and a segmentation unit 522, configured to segment the stitched image according to the stitching mode.
In this embodiment, the initial semantic segmentation model is trained and the first obtaining unit 508 obtains the semantic segmentation model. The first input unit 509 inputs the acquired stitched image into the trained semantic segmentation model, and the first determining unit 510 obtains the segmentation labels. The extraction module 5112 determines the contour information of the segmentation labels by the hollowed-out interior point method, and the first calculating unit 513 determines the minimum circumscribed rectangular frame of each label from the contour information to obtain the single-image target frames. The second determining unit 514 determines the position and width and height information of the target frames, the second input unit 520 inputs this information into the trained prediction model, the third determining unit 521 determines the stitching mode of the stitched image through the prediction model, and the segmentation unit 522 segments the stitched image according to that mode. The stitched image can thus be segmented quickly and accurately, with high robustness in traffic violation scenarios.
Referring to FIG. 6, FIG. 6 is a schematic structural diagram of a stitched image segmentation device based on semantic segmentation in the present application, including:
a central processing unit 602, a memory 601, an input/output interface 603, a wired or wireless network interface 604 and a power supply 605;
the memory 601 is a transient storage memory or a persistent storage memory;
the central processor 602 is configured to communicate with the memory 601 and execute the instructions in the memory 601 to perform the steps in any of the embodiments shown in FIG. 1, FIG. 2-1 and FIG. 2-2.
The application provides a computer-readable storage medium comprising instructions which, when executed on a computer, cause the computer to execute the method corresponding to any one of the embodiments shown in FIG. 1, FIG. 2-1 and FIG. 2-2.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the present application, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Claims (10)

1. A spliced image segmentation method based on semantic segmentation is characterized by comprising the following steps:
inputting an obtained spliced image into a pre-trained semantic segmentation model, wherein the spliced image comprises a single image;
determining segmentation labels of the spliced image through the semantic segmentation model;
extracting contour information of the segmentation labels;
outputting contour points of the segmentation labels according to the contour information, wherein the contours of the segmentation labels are composed of a plurality of contour points;
calculating the minimum circumscribed rectangle frame of the contour points of the segmentation labels, and obtaining a single-image target frame, wherein the single-image target frame is a frame of an area occupied by each single image in the spliced image;
determining position information and width and height information of the single-image target frame;
inputting the position information and the width and height information into a pre-trained prediction model;
determining a splicing mode of the spliced images through the prediction model;
and segmenting the spliced image according to the splicing mode.
2. The stitched image segmentation method of claim 1, wherein the semantic segmentation model is obtained by:
building an initial semantic segmentation model;
acquiring a first sample splicing map, wherein the first sample splicing map contains label information of a splicing image;
inputting the first sample mosaic into the initial semantic segmentation model;
extracting features in the first sample splicing image to obtain a first image feature image;
performing feature fusion on the first image feature map, and outputting a fused second image feature map;
processing the second image feature map to obtain a sample segmentation label;
performing first loss value calculation on the sample segmentation label to generate first loss value variation data, wherein the first loss value variation data is a first loss value data collection counted during each training of the initial semantic segmentation model;
and when the first loss value change data reach a preset condition, obtaining the semantic segmentation model.
3. The stitched image segmentation method of claim 2, wherein the initial semantic segmentation model comprises a lightweight network, a feature pyramid network, a segmentation header, and a classification algorithm;
the lightweight network is used as an encoder for extracting image features of the spliced image;
the characteristic pyramid network is used for extracting and fusing characteristics with different spatial resolutions in the image characteristics extracted by the lightweight network so as to extract more image characteristic information;
the segmentation head is used for determining final features from a plurality of image feature information extracted from the feature pyramid network, the classification algorithm is used for determining classification categories of the spliced image, the segmentation head comprises a convolution layer, an upsampling layer and an activation function layer sigmoid, and the classification algorithm comprises a pooling layer, a fully connected layer and an activation function layer sigmoid.
4. The method for segmenting the stitched image according to claim 3, wherein the feature fusion of the first image feature map and the output of the fused second image feature map comprises:
the feature pyramid network is used as a decoder and is used for fusing the first image feature map output by the lightweight network to output a second image feature map;
the processing the second image feature map to obtain a sample segmentation label includes:
and inputting the second image feature map into the convolutional layer, the upsampling layer and the activation function layer sigmoid to obtain the sample segmentation label, wherein the sample segmentation label is an area occupied by a single map in the second image feature map.
5. The stitched image segmentation method of claim 1, wherein the prediction model is obtained by:
building an initial prediction model;
acquiring a second sample mosaic, wherein the second sample mosaic comprises position information and width and height information of each single image and a mosaic mode of the second sample mosaic;
inputting the second sample mosaic into the initial prediction model;
performing second loss value calculation on the second sample splicing diagram according to a preset loss function to generate second loss value change data, wherein the second loss value change data is a loss value data collection counted during each training of the initial prediction model;
and when the second loss value change data reaches convergence, obtaining the prediction model.
6. The stitched image segmentation method according to claim 5, wherein, when the prediction model is trained, the single-image position information is used as training data and the stitching pattern is used as the label.
7. The stitched image segmentation method of any one of claims 1 to 6, wherein the extracting the contour information of the segmentation labels comprises:
performing binarization processing on the segmentation labels to obtain binarized segmentation labels;
and performing contour extraction on the binarized segmentation labels by the hollowed-out interior point method to obtain the contour information.
8. A mosaic image segmentation device based on semantic segmentation is characterized by comprising:
the first input unit is used for inputting the obtained spliced image into a pre-trained semantic segmentation model, wherein the spliced image comprises a single image;
the first determining unit is used for determining the segmentation labels of the spliced image through the semantic segmentation model;
an extraction unit, configured to extract the contour information of the segmentation labels;
the output unit is used for outputting the contour points of the segmentation labels according to the contour information, and the contours of the segmentation labels are composed of a plurality of contour points;
the first calculation unit is used for calculating a minimum circumscribed rectangular frame of the contour points of the segmentation labels and obtaining a single-image target frame, wherein the single-image target frame is a frame of an area occupied by each single image in the spliced image;
the second determining unit is used for determining the position information and the width and height information of the single-image target frame;
the second input unit is used for inputting the position information and the width and height information into a pre-trained prediction model;
a third determining unit, configured to determine a stitching mode of the stitched image through the prediction model;
and the segmentation unit is used for segmenting the spliced image according to the splicing mode.
9. A spliced image segmentation device based on semantic segmentation is characterized by comprising:
the system comprises a central processing unit, a memory, an input/output interface, a wired or wireless network interface and a power supply;
the memory is a transient memory or a persistent memory;
the central processor is configured to communicate with the memory and execute the instruction operations in the memory to perform the stitched image segmentation method of any one of claims 1 to 7.
10. A computer-readable storage medium comprising instructions that, when executed on a computer, cause the computer to perform the stitched image segmentation method of any one of claims 1 to 7.
CN202210701199.5A 2022-06-21 2022-06-21 Spliced image segmentation method, device and equipment based on semantic segmentation Active CN114782459B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210701199.5A CN114782459B (en) 2022-06-21 2022-06-21 Spliced image segmentation method, device and equipment based on semantic segmentation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210701199.5A CN114782459B (en) 2022-06-21 2022-06-21 Spliced image segmentation method, device and equipment based on semantic segmentation

Publications (2)

Publication Number Publication Date
CN114782459A (en) 2022-07-22
CN114782459B (en) 2022-08-30

Family

ID=82420253

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210701199.5A Active CN114782459B (en) 2022-06-21 2022-06-21 Spliced image segmentation method, device and equipment based on semantic segmentation

Country Status (1)

Country Link
CN (1) CN114782459B (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11100366B2 (en) * 2018-04-26 2021-08-24 Volvo Car Corporation Methods and systems for semi-automated image segmentation and annotation

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340837A (en) * 2020-02-18 2020-06-26 上海眼控科技股份有限公司 Image processing method, device, equipment and storage medium
CN111311601A (en) * 2020-03-26 2020-06-19 深圳极视角科技有限公司 Segmentation method and device for spliced image
CN111462140A (en) * 2020-04-30 2020-07-28 同济大学 Real-time image instance segmentation method based on block splicing
CN113096016A (en) * 2021-04-12 2021-07-09 广东省智能机器人研究院 Low-altitude aerial image splicing method and system
CN113362394A (en) * 2021-06-11 2021-09-07 上海追势科技有限公司 Vehicle real-time positioning method based on visual semantic segmentation technology

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Adapting Semantic Segmentation Models for Changes in Illumination and Camera Perspective; Wei Zhou et al.; arXiv:1809.04730v1; 2018-09-13; full text *
Semantic segmentation of traffic surveillance video images and a stitching method; Liu Sichao et al.; Acta Geodaetica et Cartographica Sinica; 2020-04-15 (No. 04); full text *
A blind forensics algorithm for image splicing tampering based on detection and segmentation; Yang Chao et al.; Electronic Design Engineering; 2020-07-05 (No. 13); full text *

Also Published As

Publication number Publication date
CN114782459A (en) 2022-07-22

Similar Documents

Publication Publication Date Title
CN109961049B (en) Cigarette brand identification method under complex scene
CN109615611B (en) Inspection image-based insulator self-explosion defect detection method
US11830230B2 (en) Living body detection method based on facial recognition, and electronic device and storage medium
KR101403876B1 (en) Method and Apparatus for Vehicle License Plate Recognition
CN106683119B (en) Moving vehicle detection method based on aerial video image
CN110866871A (en) Text image correction method and device, computer equipment and storage medium
CN108921120B (en) Cigarette identification method suitable for wide retail scene
CN109840483B (en) Landslide crack detection and identification method and device
CN110929593A (en) Real-time significance pedestrian detection method based on detail distinguishing and distinguishing
CN110751154B (en) Complex environment multi-shape text detection method based on pixel-level segmentation
CN112734761B (en) Industrial product image boundary contour extraction method
CN110717886A (en) Pavement pool detection method based on machine vision in complex environment
Xing et al. Traffic sign recognition using guided image filtering
CN113240623B (en) Pavement disease detection method and device
CN111695373A (en) Zebra crossing positioning method, system, medium and device
CN113989604A (en) Tire DOT information identification method based on end-to-end deep learning
CN112907626A (en) Moving object extraction method based on satellite time-exceeding phase data multi-source information
CN113688846A (en) Object size recognition method, readable storage medium, and object size recognition system
CN115661522A (en) Vehicle guiding method, system, equipment and medium based on visual semantic vector
CN115908774A (en) Quality detection method and device of deformed material based on machine vision
CN108022245A (en) Photovoltaic panel template automatic generation method based on upper thread primitive correlation model
Giri Text information extraction and analysis from images using digital image processing techniques
Harianto et al. Data augmentation and faster rcnn improve vehicle detection and recognition
CN114782459B (en) Spliced image segmentation method, device and equipment based on semantic segmentation
Lafuente-Arroyo et al. Traffic sign classification invariant to rotations using support vector machines

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 266000 F3, Jingkong building, No. 57 Lushan Road, Huangdao District, Qingdao, Shandong

Patentee after: Shandong Jijian Technology Co.,Ltd.

Address before: 266000 F3, Jingkong building, No. 57 Lushan Road, Huangdao District, Qingdao, Shandong

Patentee before: Shandong jivisual angle Technology Co.,Ltd.