CN117764998A - Image segmentation model with multichannel parallel input - Google Patents

Image segmentation model with multichannel parallel input

Info

Publication number
CN117764998A
Authority
CN
China
Prior art keywords
image
channel
semantic
features
segmentation model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311749684.0A
Other languages
Chinese (zh)
Inventor
刁晓淳
王文瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Corelli Software Co ltd
Original Assignee
Shanghai Corelli Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Corelli Software Co ltd filed Critical Shanghai Corelli Software Co ltd
Priority to CN202311749684.0A priority Critical patent/CN117764998A/en
Publication of CN117764998A publication Critical patent/CN117764998A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to a multi-channel parallel-input image segmentation model comprising an input image group, a network architecture and an output image, wherein the input image group is formed by a plurality of different channel images, the output image is a segmentation result with the same resolution as the input images, and the network architecture comprises a feature analysis module, a multi-channel semantic synthesis module and a comprehensive analysis module. The multi-channel parallel-input image segmentation model can combine multiple types of images, so that defects are identified effectively and comprehensively.

Description

Image segmentation model with multichannel parallel input
Technical Field
The invention relates to the field of image detection, in particular to a multi-channel parallel input image segmentation model.
Background
In the field of industrial automation quality inspection based on image detection algorithms, particularly in the manufacture of liquid crystal panels, it is often necessary to switch between different types of light sources when imaging the same product. The same position of a product must also be imaged multiple times across different processes and process sections, and the multiple different image states are finally combined to jointly confirm whether a defect exists and whether the product is a good one.
Defects that can only be judged by combining several different states are often the defects that matter most within a process section and that affect the structural parameters and basic functions of the whole product. Such defects are typically difficult to locate, difficult to detect, and difficult to judge jointly, and their detection directly affects product yield.
The image detection algorithms commonly used in the industry are designed for a single station, a single shooting scene, and a single image. They can hardly process multiple images of the same position in parallel; at best, a separate method processes each different image once and the results are then summarized logically.
Existing detection modes fall into two types. One is manual detection: images from multiple fields of view are displayed simultaneously on a manual inspection software interface, and an operator surveys all images to judge the defect position through comprehensive analysis. The other uses an image algorithm: a defect result is computed independently for each image, the defect results of all images are then collected, and a single defect verdict is synthesized according to a formula or to priority logic.
The disadvantage of the manual scheme is that an operator must be assigned to the inspection station to respond to defect identification requests in real time, which wastes labor cost, while personnel fatigue, poor positioning and similar factors introduce errors. Multi-channel defects are judged largely from the experience of the image reviewer, so the judgment is strongly affected by personnel factors: for the same set of images, different operators judging from personal experience often reach inconsistent results. In current practice an experienced reviewer usually has to act as a review-group leader, spot-checking and rechecking the manual results, which again wastes considerable labor. For defects judged inconsistently, yield problems such as over-kill and missed detection also arise.
Vision detection based on a single image has two major disadvantages. First, many defects can only be judged by synthesizing multiple different images, and only on that basis can an algorithm be designed; an algorithm based on a single image cannot realize this function, since it finds only the defect features visible in that one image, and those features are limited. Second, in the final defect synthesis step the synthesis logic is preset and hard-coded in the program; it cannot be adjusted to the actual situation, and the correctness of the preset rules and of the logic priorities is questionable. In general, single-image detection still suffers the corresponding yield problems of over-kill and missed detection.
Therefore, how to effectively combine the images from each photographing position, each light source, and each process section to judge defects comprehensively is one of the key difficulties in the image quality inspection industry.
Disclosure of Invention
In order to solve the above technical problems in the prior art, the invention provides a multi-channel parallel-input image segmentation model which can combine multiple types of images to identify defects effectively and comprehensively.
In order to achieve the above purpose, the technical solution of the present invention is as follows:
the image segmentation model with multi-channel parallel input comprises an input image group, a network architecture and an output image, wherein the input image group is formed by a plurality of different channel images, the output image is a segmentation result with the same resolution as the input images, and the network architecture comprises a feature analysis module, a multi-channel semantic synthesis module and a comprehensive analysis module.
As a preferred technical solution, the feature analysis module performs an independent calculation on each single-channel image: each single-channel image is fed into its own sub-network models for high-low dimensional semantic features and for image features, so that each single-channel image finally outputs two types of features, one being the high-low dimensional semantic feature and the other the image feature.
As a preferred technical solution, the high-low dimensional semantic feature is essentially a long-column feature vector, which contains high-low dimensional information representative of images, and the image feature is a multi-channel tensor.
As a preferred technical solution, the multi-channel semantic synthesis module adopts twenty parallel semantic synthesis modules with the same function; in each module, three of the six channel images are taken as image features and the other three channels as high-low dimensional semantic features for calculation, and the resulting twenty semantic synthesis results are input to the comprehensive analysis module.
As a preferred technical solution, each single semantic synthesis module consists of a plurality of SPADE sub-modules, and the SPADE sub-modules fuse the high-low dimensional semantic features with the image features through multiple rounds of convolution and addition.
As a preferred technical solution, the comprehensive analysis module splices the features of all channels into one large tensor and applies dimension-reducing dilated (atrous) convolution to it until a single-channel mask image is formed, which is output as the result.
As a preferred technical solution, the training process of the comprehensive analysis module is as follows: a data set is prepared containing multiple sets of data, each set comprising the six measured-object images of one field of view together with a fully labeled semantic segmentation mask; 70% of the data set is randomly selected as the training set, 20% as the test set, and 10% as the validation set; in a single training run, the loss function on the training set serves as the training direction, the loss function on the test set as the basis for parameter adjustment, and the loss function on the validation set characterizes the final effect.
As a preferred technical solution, an end-to-end semantic segmentation training mode is adopted, and the objective function differentiated during training is expressed as a difference loss function over the semantic segmentation mask.
Compared with the prior art, the invention has the beneficial effects that:
the invention provides a multi-channel parallel input image segmentation model, the number of pictures input by the model can be freely configured, and the model is not limited to a single piece or fixed pieces. The model thus has algorithmic capabilities that integrate multiple channels of different images. Meanwhile, as the image segmentation model based on the deep neural network is used, in the end-to-end labeling link, the model can automatically learn and acquire the characteristics of the same position in different images only by labeling the defect position in the image of one channel. And establishing correlations among multichannel images, high-dimensional features in current pixels of the images and the adjacent areas of the images through a convolutional neural network, and finally realizing correct judgment.
Drawings
FIG. 1 is a schematic diagram of an image segmentation model of a multi-channel parallel input of the present invention;
FIG. 2 is a schematic diagram of a feature analysis module in an image segmentation model of the multi-channel parallel input of the present invention;
FIG. 3 is a schematic diagram of a multi-channel semantic synthesis module in a multi-channel parallel input image segmentation model of the present invention;
FIG. 4 is a schematic diagram of the SPADE ResBlock module of FIG. 3;
FIG. 5 is a schematic diagram of the SPADE sub-module of FIG. 4;
FIG. 6 is a schematic diagram of a comprehensive analysis module in an image segmentation model of the multi-channel parallel input of the present invention.
Detailed Description
The technical scheme of the invention is further described below with reference to the specific embodiments:
as shown in FIG. 1, the image segmentation model with parallel multi-channel input comprises an input image group, a network architecture and an output image, wherein the input image group is an image group formed by a plurality of different channel images, the output image is a segmentation result with the same resolution as the input image, and the network architecture comprises a feature analysis module, a multi-channel semantic synthesis module and a comprehensive analysis module.
As shown in fig. 2, the feature analysis module performs an independent calculation for each single-channel image, feeding each single-channel image into its own sub-network models for high-low dimensional semantic features and for image features, so that each single-channel image finally outputs two types of features, one being the high-low dimensional semantic feature and the other the image feature. The high-low dimensional semantic feature is essentially a long column feature vector containing representative high- and low-dimensional information of the image, while the image feature is a multi-channel tensor.
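The two outputs per channel can be sketched as follows. The patent does not disclose the sub-network internals, so the projection used here is an illustrative stand-in, and the dimensions `vec_dim` and `feat_channels` are assumptions:

```python
import numpy as np

def feature_analysis(channel_image, vec_dim=128, feat_channels=16):
    """Hypothetical per-channel feature extractor sketch.

    The patent only states that each single-channel image yields
    (a) a long column vector of high-low dimensional semantics and
    (b) a multi-channel image-feature tensor; the stacking and random
    projection below are illustrative stand-ins, not the real model.
    """
    h, w = channel_image.shape
    # "Image feature": a multi-channel tensor at the input resolution.
    image_feature = np.stack([channel_image] * feat_channels, axis=0)
    # "High-low dimensional semantic feature": a long column vector,
    # here a fixed-size random projection of the flattened image.
    rng = np.random.default_rng(0)
    proj = rng.standard_normal((vec_dim, h * w)) / np.sqrt(h * w)
    semantic_vector = proj @ channel_image.reshape(-1)
    return semantic_vector, image_feature
```

In the described six-channel setup, this function would be applied once per channel, yielding six semantic vectors and six image-feature tensors.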
As shown in fig. 3, taking a six-channel input image group as an example, after feature analysis of the six single-channel images, six corresponding high-low dimensional semantic features and six image features are obtained, and these two groups of features are selectively input to the subsequent multi-channel semantic synthesis module.
The multi-channel semantic synthesis module adopts twenty parallel semantic synthesis modules with the same function. The core function of a semantic synthesis module is to use the high-low dimensional semantic features of three of the six channel images to semantically fuse the image features of the other three images, obtaining a feature tensor that fuses the six images and their different-dimensional information.
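The count of twenty parallel modules matches the number of ways to choose which three of the six channels serve as image features, with the complementary three supplying semantics (C(6,3) = 20). This reading is an assumption consistent with the description, and can be checked with the standard library:

```python
import math
from itertools import combinations

channels = set(range(6))  # the six-channel input image group
# hypothetically, each parallel module takes 3 channels as image
# features and the complementary 3 as high-low dimensional semantics
splits = [(set(img_feats), channels - set(img_feats))
          for img_feats in combinations(sorted(channels), 3)]
assert len(splits) == math.comb(6, 3) == 20  # twenty parallel modules
```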
As shown in fig. 4, each single semantic synthesis module consists of a plurality of SPADE sub-modules, and as shown in fig. 5, the SPADE sub-modules fuse the high-low dimensional semantic features with the image features through multiple rounds of convolution and addition.
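A minimal NumPy sketch of one SPADE-style fusion step, under the assumption that "convolution and addition" follows the original SPADE formulation: the image feature is normalized, then modulated by a scale (gamma) and shift (beta) derived from the semantic input. The linear maps below stand in for the actual convolution layers, and all weights are illustrative:

```python
import numpy as np

def spade_fuse(image_feature, semantic_vector):
    """Illustrative SPADE-style sub-module (assumed shapes/weights).

    image_feature: (C, H, W) tensor; semantic_vector: (D,) vector.
    Returns a modulated (C, H, W) tensor.
    """
    c, h, w = image_feature.shape
    # per-channel normalization of the image feature
    mu = image_feature.mean(axis=(1, 2), keepdims=True)
    sigma = image_feature.std(axis=(1, 2), keepdims=True) + 1e-5
    normed = (image_feature - mu) / sigma
    # gamma/beta from the semantic vector (stand-in for conv layers)
    rng = np.random.default_rng(1)
    w_gamma = rng.standard_normal((c, semantic_vector.size)) * 0.01
    w_beta = rng.standard_normal((c, semantic_vector.size)) * 0.01
    gamma = (w_gamma @ semantic_vector)[:, None, None]
    beta = (w_beta @ semantic_vector)[:, None, None]
    # "convolution and addition": modulate, then shift
    return normed * (1 + gamma) + beta
```

In the patent's architecture several such sub-modules would be chained inside each of the twenty parallel synthesis modules.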
The twenty semantic synthesis results are then input to the final comprehensive analysis stage.
As shown in fig. 6, the comprehensive analysis module first splices the features of all channels into one large tensor and then applies dimension-reducing dilated (atrous) convolution to it until a single-channel mask image is formed, which is output as the result.
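Assuming "hole convolution" refers to dilated (atrous) convolution (空洞卷积), one layer of the channel-collapsing step can be sketched as below; the 3×3 kernel size and dilation rate are assumptions, since the patent does not specify them:

```python
import numpy as np

def dilated_conv_to_mask(x, kernel, dilation=2):
    """Single dilated (atrous) 3x3 convolution collapsing all input
    channels of x (C, H, W) into one (H, W) map, with zero padding
    chosen so the output keeps the input resolution."""
    c, h, w = x.shape
    pad = dilation  # (kernel_size - 1) // 2 * dilation for 3x3
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((h, w))
    for i in range(3):          # kernel rows
        for j in range(3):      # kernel cols
            di, dj = i * dilation, j * dilation
            # accumulate each shifted view weighted by its kernel tap
            out += np.einsum('chw,c->hw',
                             xp[:, di:di + h, dj:dj + w], kernel[:, i, j])
    return out
```

Stacking several such layers with decreasing channel counts would realize the "dimension-reducing" behavior described above, ending in the single-channel mask.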
The training process of the comprehensive analysis module is as follows: a data set is prepared containing multiple sets of data, each set comprising the six measured-object images of one field of view together with a fully labeled semantic segmentation mask; 70% of the data set is randomly selected as the training set, 20% as the test set, and 10% as the validation set. In a single training run, the loss function on the training set serves as the training direction, the loss function on the test set as the basis for parameter adjustment, and the loss function on the validation set characterizes the final effect. Because an end-to-end semantic segmentation training mode is adopted, the loss function serves as the objective function differentiated during training and is expressed as a difference loss function over the semantic segmentation mask.
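The 70/20/10 split described above can be sketched with the standard library; the helper name and the use of a fixed seed are illustrative:

```python
import random

def split_dataset(samples, seed=0):
    """Random 70/20/10 train/test/validation split, as described
    in the training process above (proportions from the patent)."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    n_train = round(0.7 * len(samples))
    n_test = round(0.2 * len(samples))
    train = [samples[i] for i in idx[:n_train]]
    test = [samples[i] for i in idx[n_train:n_train + n_test]]
    val = [samples[i] for i in idx[n_train + n_test:]]
    return train, test, val
```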
This embodiment further illustrates the present invention and is not to be construed as limiting it; after reading this specification, those skilled in the art may make non-inventive modifications to this embodiment as required, which remain within the scope of protection of the claims of the present invention.

Claims (8)

1. An image segmentation model with multi-channel parallel input, characterized by comprising an input image group, a network architecture and an output image, wherein the input image group is formed by a plurality of different channel images, the output image is a segmentation result with the same resolution as the input images, and the network architecture comprises a feature analysis module, a multi-channel semantic synthesis module and a comprehensive analysis module.
2. The multi-channel parallel input image segmentation model of claim 1, wherein the feature analysis module performs an independent calculation for each single-channel image, feeding each single-channel image into its own sub-network models for high-low dimensional semantic features and for image features, so that each single-channel image finally outputs two types of features, one being the high-low dimensional semantic feature and the other the image feature.
3. The multi-channel parallel input image segmentation model of claim 2, wherein the high-low dimensional semantic feature is essentially a long column feature vector containing representative high- and low-dimensional information of the image, and the image feature is a multi-channel tensor.
4. The image segmentation model of multi-channel parallel input according to claim 2, wherein the multi-channel semantic synthesis module adopts twenty parallel semantic synthesis modules with the same function, three images are selected from six-channel images as image features, three channels are selected as high-low dimensional semantic features for calculation, and the twenty obtained semantic synthesis results are input to the comprehensive analysis module.
5. The multi-channel parallel input image segmentation model according to claim 4, wherein the single semantic synthesis module is composed of a plurality of SPADE sub-modules, and the SPADE sub-modules fuse high-low dimensional semantic features with image features in a mode of multiple convolution and addition.
6. The multi-channel parallel input image segmentation model according to claim 1, wherein the comprehensive analysis module splices the features of all channels into one large tensor and applies dimension-reducing dilated (atrous) convolution to it until a single-channel mask image is formed, which is output as the result.
7. The multi-channel parallel input image segmentation model of claim 6, wherein the training process of the comprehensive analysis module is as follows: a data set is prepared containing multiple sets of data, each set comprising the six measured-object images of one field of view together with a fully labeled semantic segmentation mask; 70% of the data set is randomly selected as the training set, 20% as the test set, and 10% as the validation set; in a single training run, the loss function on the training set serves as the training direction, the loss function on the test set as the basis for parameter adjustment, and the loss function on the validation set characterizes the final effect.
8. The multi-channel parallel input image segmentation model of claim 7, wherein an end-to-end semantic segmentation training mode is adopted, and the objective function differentiated during training is expressed as a difference loss function over the semantic segmentation mask.
CN202311749684.0A 2023-12-19 2023-12-19 Image segmentation model with multichannel parallel input Pending CN117764998A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311749684.0A CN117764998A (en) 2023-12-19 2023-12-19 Image segmentation model with multichannel parallel input


Publications (1)

Publication Number Publication Date
CN117764998A 2024-03-26

Family

ID=90321482

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311749684.0A Pending CN117764998A (en) 2023-12-19 2023-12-19 Image segmentation model with multichannel parallel input

Country Status (1)

Country Link
CN (1) CN117764998A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination