CN114627297A - Image semantic segmentation method for irregular materials based on transfer learning

Image semantic segmentation method for irregular materials based on transfer learning

Info

Publication number
CN114627297A
CN114627297A (application CN202210331353.4A)
Authority
CN
China
Prior art keywords
image
irregular material
irregular
semantic segmentation
pixel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202210331353.4A
Other languages
Chinese (zh)
Inventor
曹东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Dongru Technology Co ltd
Original Assignee
Wuxi Dongru Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Dongru Technology Co ltd filed Critical Wuxi Dongru Technology Co ltd
Priority to CN202210331353.4A priority Critical patent/CN114627297A/en
Publication of CN114627297A publication Critical patent/CN114627297A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a transfer-learning image semantic segmentation method for irregular materials, which resolves the semantic segmentation task with a dynamic rendering model driven by a random vector, inferring the image together with its corresponding quasi-annotation mask label. The method supplements a small-scale semantically labeled image data set with widely available unlabeled irregular-material image data, trains the model network jointly with the limited labeled irregular-material segmentation data, and thereby obtains a model with excellent generalization capability that can be migrated effectively to a variety of industrial production scenes, markedly reducing the manual labeling workload and improving overall efficiency.

Description

Image semantic segmentation method for irregular materials based on transfer learning
Technical Field
The invention relates to an image semantic segmentation method for irregular materials in an intelligent factory production scene, and relates to the field of intelligent manufacturing and machine vision.
Background
Industrial production scenes present a large number of inspection demands, and typical industrial inspection applications focus on attributes of the target object such as geometric shape, size, spatial position, surface defects, and edges. In many application scenarios the geometric appearance of the material must be inspected in real time; the manual workload of such inspection is huge and the labor cost high, yet the results of manual inspection are far from ideal: subjective deviation is large, efficiency is low, inspection quality fluctuates strongly, and accuracy cannot be guaranteed. As the intelligence level of industrial production gradually improves, more and more inspection processes are implemented with artificial intelligence algorithms, which significantly raises the level of intelligent manufacturing and strongly promotes the construction of smart factories.
One of the main ways to detect attributes such as the geometric shape of industrial materials is to apply an image semantic segmentation algorithm for intelligent detection. Mature intelligent detection algorithms exist for targets with regular geometric appearance, such as standard workpieces, but there is no unified, high-performance intelligent detection method for irregular materials in production scenes, even though the market demand for irregular production workpieces is huge and creates a correspondingly large demand for image semantic segmentation. Because irregular materials have no fixed, regular geometric appearance, they pose many technical challenges to detection work.
Existing semantic segmentation methods include U-shaped network algorithms (UNet, UNet++), the V-shaped network algorithm (VNet), and the DeepLab family (V1, V2, V3). They depend excessively on large amounts of annotated data: in an actual industrial production environment a high-quality, effective training set can only be obtained through manual annotation by highly skilled experts with rich field experience, so labeled data is often very expensive. These deep-model-based segmentation networks need data support and usually require training on large-scale data sets to reach high accuracy. Even when large data sets are available, it remains difficult to generalize the performance of UNet/UNet++, VNet, and DeepLab (V1, V2, V3) to out-of-distribution data, such as images captured in other production scenes.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: from the perspective of image semantic segmentation, a semi-supervised method is adopted to perform transfer learning from a large number of unlabeled irregular-material sample images together with a small number of labeled samples, so as to solve the geometric shape and edge detection problem of irregular materials under small-sample learning (a small-scale labeled data set).
The invention adopts the following technical scheme for solving the technical problems:
the invention provides an image semantic segmentation method for irregular materials, which comprises the following steps:
Step S1: constructing a training data set comprising two sub data sets, one an unlabeled irregular-material image set and the other a labeled irregular-material image set;
Step S2: constructing and training a pixel-level rendering module which takes a random variable in vector form as input and outputs a semantic segmentation quasi-annotation mask together with an image; the input random variable is generated by converting, through a deep network, the random-variable probability distribution of the regular-material image semantic space into a new spatial distribution, the conversion criterion being that the new distribution better conforms to the image semantic distribution of irregular materials in the production scene;
Step S3: constructing an irregular-material rendered-image discriminator applied to real irregular-material images and rendered irregular-material images in the production scene, and an irregular-material rendered image-label pair discriminator for distinguishing rendered irregular-material image-label pairs from real production-scene irregular-material image-label pairs;
Step S4: setting up an image semantic manifold converter that maps production-scene irregular-material images to embedded random variables, yielding an embedded enhanced space; combined with the labeled irregular-material images, the corresponding semantic segmentation quasi-annotation mask labels of the irregular-material images can then be rendered; during inference, the embedding of a new image, i.e. its vector representation, is computed first;
Step S5: given a target image of irregular material, the target image is embedded into the embedded enhanced space of the pixel-level rendering module; the image semantic manifold converter maps the target image into this space, an inversion estimate of the enhanced-space vector is computed, and the optimal pixel-level label is obtained by inference;
Step S6: combining the learned perceptual image-patch similarity with a per-pixel Mahalanobis distance term of the irregular-material target image, the inversion estimate of the enhanced-space vector is passed back to the pixel-level rendering module to obtain the image semantic segmentation result for the irregular material, namely the target image and its quasi-annotation mask label.
Further, in the image semantic segmentation method for irregular materials provided by the invention, step S1 constructs a training data set comprising two sub data sets T_a and T_b. The data set T_a is an unlabeled irregular-material image set, T_a = {u_1, ..., u_J}, where u_j denotes each irregular-material image, j = 1, 2, ..., J. The data set T_b is a data-cleaned irregular-material image set in which every image carries a semantic segmentation annotation, T_b = {(u_1, v_1), ..., (u_L, v_L)}, where u_l denotes each irregular-material image and v_l denotes the semantic segmentation quasi-annotation mask label corresponding to image u_l, l = 1, 2, ..., L, with J >> L.
Furthermore, in the method provided by the invention, the pixel-level rendering module of step S2 adopts a residual skip-connection design with 5 cascaded feature layers arranged in order from the input end, and an additional branch is added to each feature layer so that the module outputs a semantic segmentation quasi-annotation mask v together with the image output u.
Furthermore, in the method provided by the invention, training takes a random noise space as input; the random variable of this space is denoted h, and h obeys a standard normal distribution. The distribution p(h) is first converted into a new spatial distribution p(r) through a fully connected network, a fully convolutional network and a 1 × 1 convolutional layer; the conversion criterion is that the spatial distribution p(r) better conforms to, or approximates, the image semantic distribution of irregular materials in the smart-factory production scene. After an affine transformation, the noise variables r with this newly adjusted distribution are fed to the feature layers of the pixel-level rendering module, which outputs an image u in the output image distribution space and a semantic segmentation quasi-annotation mask label v in the quasi-annotation mask label distribution space of irregular materials.
Furthermore, in the image semantic segmentation method for irregular materials provided by the invention, the objective function used in training the pixel-level rendering module involves the sigmoid function sigmoid(t) = S(t) = 1/(1 + e^(-t)); the training objective is to minimize this objective function by back-propagation gradient descent.
Further, in step S3 of the method provided by the invention, an irregular-material rendered-image discriminator is constructed which maps an image to a real-valued score. It is applied to real irregular-material images and rendered irregular-material images in the production scene, drives the pixel-level rendering module to construct realistic images, and is implemented on a residual structure; the training objective is to maximize the discriminator's objective function by the back-propagation gradient descent method.
Further, in step S3 of the method provided by the invention, an irregular-material rendered image-label pair discriminator is also constructed. It concatenates the input image with the semantic segmentation quasi-annotation mask label to distinguish rendered irregular-material image-label pairs from real production-scene irregular-material image-label pairs, and adopts a multi-scale discriminator architecture; the training objective is to maximize this discriminator's objective function by the back-propagation gradient descent method.
Further, in step S4 of the method provided by the invention, the image semantic manifold converter extracts multi-level features using a feature pyramid network as backbone and maps these features, through a fully convolutional network, into the embedded enhanced space. When training the image semantic manifold converter, the model parameters of the pixel-level rendering module must be fixed, i.e. the rendering module does not participate in training at this stage. The converter's training objective sums a pixel-by-pixel KL divergence loss over all pixels of the irregular-material image; this first term represents the supervised loss. For discrete events the KL divergence is defined as D_KL(A || B) = Σ_i P_A(x_i) log(P_A(x_i) / P_B(x_i)), where A and B denote two random variables with corresponding probability distributions P_A(x_i) and P_B(x_i), and i indexes the discrete random variable. The sum of the last two terms of the objective function represents the unsupervised loss, where α is a hyper-parameter with value range [0, 1] under the irregular-material image semantic distribution condition; one term involves the irregular-material image rendering backbone of the pixel-level rendering module, the other the branch that renders the irregular-material image semantic segmentation quasi-annotation mask label. The learned perceptual image-patch similarity distance measures distance in the feature space of a deep residual network pre-trained on ImageNet, and the Mahalanobis distance between data points x and y is d_M(x, y) = sqrt((x - y)^T Σ^(-1) (x - y)), where Σ is the covariance matrix of the multidimensional random variable and T denotes transposition.
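As a concrete illustration of the two distance measures referenced above, a minimal Python sketch is given below; the function names, the input normalisation, and the small numerical example are illustrative assumptions and not part of the claimed method.

```python
import numpy as np

def kl_divergence(p_a: np.ndarray, p_b: np.ndarray, eps: float = 1e-12) -> float:
    """Discrete KL divergence D_KL(A || B) = sum_i P_A(x_i) * log(P_A(x_i) / P_B(x_i))."""
    p_a = p_a / p_a.sum()
    p_b = p_b / p_b.sum()
    return float(np.sum(p_a * np.log((p_a + eps) / (p_b + eps))))

def mahalanobis_distance(x: np.ndarray, y: np.ndarray, cov: np.ndarray) -> float:
    """Mahalanobis distance d_M(x, y) = sqrt((x - y)^T Sigma^{-1} (x - y))."""
    diff = x - y
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

# Example: predicted class distribution at one pixel vs. its one-hot quasi-annotation
p_pred = np.array([0.7, 0.2, 0.1])
p_true = np.array([1.0, 0.0, 0.0])
print(kl_divergence(p_true, p_pred), mahalanobis_distance(p_pred, p_true, np.eye(3)))
```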
Furthermore, in step S5 of the method provided by the invention, given a target image of irregular material in the smart-factory production scene, the optimal pixel-level label is derived as follows. The target image is first embedded into the embedding space of the pixel-level rendering module: the image semantic manifold converter maps the target image into that space, and an inversion estimate of the enhanced-space vector is obtained. In the inversion objective, the first term optimizes the reconstruction quality of the given irregular-material target image, and the second term regularizes the embedding-vector trajectory of the target image so that it stays within the training domain; the image semantic manifold converter thus serves to approximately invert the pixel-level rendering module.
Furthermore, in the method provided by the invention, the inferred irregular-material target image and its image semantic segmentation quasi-annotation mask label are based on a reconstruction term in which the learned perceptual image-patch similarity and a per-pixel Mahalanobis distance term of the irregular-material target image act jointly; δ is an independent hyper-parameter with value range (0, 1). The enhanced-space vector obtained by inversion is passed back to the pixel-level rendering module, whose two outputs are the inferred irregular-material target image and the inferred image semantic segmentation quasi-annotation mask label, with the learned perceptual image-patch similarity distance used as the reconstruction metric.
Compared with the prior art, the invention adopting the technical scheme has the following technical effects:
the invention constructs a semantic segmentation method for an irregular material image by semi-supervised transfer learning, and judges a semantic segmentation task by using a random vector dynamic rendering model of an inferred image and a corresponding quasi-standard mark mask label. The method has the advantages that a small-scale image semantic segmentation marking data set is supplemented based on widely available unlabeled irregular material image data, a model network is trained by combining limited labeled irregular material image semantic segmentation data, the model generalization capability with excellent performance is obtained, the method can be effectively migrated to various different industrial production scenes, the manual labeling workload is remarkably reduced, and the algorithm efficiency is improved.
The proposed transfer-learning semantic segmentation method captures the joint distribution of the irregular-material image and its corresponding quasi-annotation mask label by adding a quasi-annotation mask label synthesis branch to a deep fully convolutional network. It performs well on semantic segmentation of irregular-material images in the production scene; model training is assisted by semi-supervised learning on a large amount of easily obtained raw, unlabeled irregular-material image data, which markedly reduces the scale of the labeled data set required for training, effectively cuts the labor cost of building large labeled data sets in production practice, and significantly improves production efficiency and economic benefit.
Drawings
Fig. 1 is an overall flowchart of the image semantic segmentation method provided by the present invention.
FIG. 2 is a schematic diagram of an image semantic distribution space of regular materials in a smart factory production scenario.
FIG. 3 is a schematic diagram of an image semantic distribution space of irregular materials in a smart factory production scenario.
FIG. 4 is a feature layer schematic of a pixel level rendering module.
Detailed Description
The technical scheme of the invention is further explained in detail below with reference to the accompanying drawings:
it will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The invention covers the algorithm construction of the transfer-learning image semantic segmentation method for irregular materials in the smart-factory production scene, model training, and deployment and inference in production. The overall structure is shown in Fig. 1, and the specific steps are as follows:
step 1: constructing a training data set comprising two sub data sets TaAnd TbWherein the data set TaIs a large scale set of unlabeled images, wherein the images are collected from irregular material in the production scene of the smart factory, denoted as Ta={u1,…,uJIn which uj(J ═ 1,2, …, J) represents each irregular material image; data set TbIs a small-scale image set, the image in the set is from irregular material in the production scene of intelligent factoryThe images are cleaned by data, and each image is marked by semantic segmentation and is represented as Tb={(u1,v1),…,(uL,vL)}. Wherein u isl(L ═ 1,2, …, L) represents each irregular material image, vl(L ═ 1,2, …, L) represents the image corresponding to ulThe semantic segmentation labeling mask label of (1); in which a sub data set TaIs much larger than the sub data set TbI.e., J > L. A large amount of unmarked data and a small amount of marked data are used for semi-supervised learning, so that the cost of manual marking is obviously reduced.
Step 2: construct the pixel-level rendering module and train it on the basis of the data set. The module adopts a skip-connection design with residuals and comprises 5 cascaded feature layers (see Fig. 4), numbered feature layer 1, feature layer 2, ..., feature layer 5 in order from the input end; an additional branch is added to each feature layer so that the module outputs a semantic segmentation quasi-annotation mask v together with an image output u. The pixel-level rendering module takes a random noise space as input; the random variable of this space is denoted h, and h obeys a standard normal distribution. The objective function used in training involves the sigmoid function sigmoid(t) = S(t) = 1/(1 + e^(-t)); the training objective is to minimize this objective function by back-propagation gradient descent.
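One plausible PyTorch realisation of such a module is sketched below: five cascaded feature layers with residual skip connections, a per-layer noise vector injected after an affine transform, and an extra branch producing the quasi-annotation mask alongside the image. Channel widths, the learned constant input and the activations are illustrative assumptions; only the structural points named in the text are taken from the patent.

```python
import torch
import torch.nn as nn

class FeatureLayer(nn.Module):
    """One of the 5 cascaded feature layers: a residual block modulated by a noise vector r."""
    def __init__(self, channels: int, noise_dim: int):
        super().__init__()
        self.affine = nn.Linear(noise_dim, channels)        # affine transform of the noise vector
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1))
    def forward(self, x, r):
        scale = self.affine(r).unsqueeze(-1).unsqueeze(-1)  # feed the adjusted noise into the layer
        return x + self.conv(x * (1 + scale))               # residual skip connection

class PixelLevelRenderer(nn.Module):
    """Outputs an image u and a quasi-annotation mask v from one noise vector per feature layer."""
    def __init__(self, channels=64, noise_dim=128, num_classes=2, size=16):
        super().__init__()
        self.const = nn.Parameter(torch.randn(1, channels, size, size))
        self.layers = nn.ModuleList([FeatureLayer(channels, noise_dim) for _ in range(5)])
        self.to_image = nn.Conv2d(channels, 3, 1)           # image output branch u
        self.to_mask = nn.Conv2d(channels, num_classes, 1)  # additional mask branch v
    def forward(self, rs):                                  # rs: list of 5 noise vectors
        x = self.const.expand(rs[0].shape[0], -1, -1, -1)
        for layer, r in zip(self.layers, rs):
            x = layer(x, r)
        return torch.tanh(self.to_image(x)), self.to_mask(x)
```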
Step 3: take the random noise variable h as input. The distribution p(h) is first converted into a new spatial distribution p(r) through a fully connected network, a fully convolutional network and a 1 × 1 convolutional layer; the conversion criterion is that the spatial distribution p(r) better conforms to, or approximates, the image semantic distribution of irregular materials in the smart-factory production scene. After an affine transformation, the noise variables r with this newly adjusted distribution are fed to the feature layers of the pixel-level rendering module, which outputs an image u in the output image distribution space and a semantic segmentation quasi-annotation mask label v in the quasi-annotation mask label distribution space of irregular materials.
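A minimal sketch of the conversion from h to r described in this step follows; the layer widths and the intermediate 4 × 4 spatial grid are illustrative assumptions, while the fully connected network, the fully convolutional network and the final 1 × 1 convolution follow the text.

```python
import torch
import torch.nn as nn

class LatentMapper(nn.Module):
    """Maps h ~ N(0, I) to r, whose distribution p(r) is meant to better match the image
    semantics of irregular materials; widths and the 4x4 grid are illustrative only."""
    def __init__(self, h_dim=128, r_dim=128):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(h_dim, 256), nn.ReLU(inplace=True),
                                nn.Linear(256, 256 * 4 * 4))          # fully connected network
        self.conv = nn.Sequential(nn.Conv2d(256, 256, 3, padding=1),  # fully convolutional network
                                  nn.ReLU(inplace=True),
                                  nn.Conv2d(256, r_dim, 1))           # final 1x1 convolution
    def forward(self, h: torch.Tensor) -> torch.Tensor:
        x = self.fc(h).view(-1, 256, 4, 4)
        return self.conv(x).mean(dim=(2, 3))                          # one r vector per sample

h = torch.randn(8, 128)   # h drawn from the standard normal distribution
r = LatentMapper()(h)     # adjusted noise vector fed to the pixel-level rendering module
```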
Step 4: construct the irregular-material rendered-image discriminator, which maps an image to a real-valued score. Applied to real irregular-material images and rendered irregular-material images from the production scene, it drives the pixel-level rendering module to construct realistic images and is implemented on a residual structure. The training objective is to maximize the discriminator's objective function by the back-propagation gradient descent method.
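Because the discriminator objective appears in the patent only as a formula image, the sketch below uses a standard sigmoid (binary cross-entropy) adversarial loss as an assumed stand-in, together with a small residual-structure discriminator; minimising this loss for the discriminator corresponds to maximising the adversarial objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResBlock(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(nn.Conv2d(c, c, 3, padding=1), nn.ReLU(inplace=True),
                                  nn.Conv2d(c, c, 3, padding=1))
    def forward(self, x):
        return F.relu(x + self.body(x))   # residual structure, as stated in the text

class ImageDiscriminator(nn.Module):
    """Maps an irregular-material image (real or rendered) to a real-valued realism score."""
    def __init__(self, c=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, c, 3, stride=2, padding=1), nn.ReLU(inplace=True), ResBlock(c),
            nn.Conv2d(c, c, 3, stride=2, padding=1), nn.ReLU(inplace=True), ResBlock(c),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(c, 1))
    def forward(self, u):
        return self.net(u)

def discriminator_step(disc, real_u, fake_u):
    """Assumed BCE/sigmoid form of the adversarial objective (the exact formula is not given)."""
    real_logits = disc(real_u)
    fake_logits = disc(fake_u.detach())
    return (F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
            + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))
```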
Step 5: construct the irregular-material rendered image-label pair discriminator. By concatenating the input image with the semantic segmentation quasi-annotation mask label, it distinguishes rendered irregular-material image-label pairs from real production-scene irregular-material image-label pairs. This enforces discrimination of the alignment between the synthesized irregular-material image and its label, so that misaligned irregular-material image-label pairs are detected as "pseudo examples". To achieve strong consistency judgments between the rendered image and the label, the invention adopts a multi-scale discriminator architecture. The training objective is to maximize this discriminator's objective function by the back-propagation gradient descent method.
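One plausible realisation of the multi-scale image-label pair discriminator is sketched below; the number of scales, channel widths and patch-wise output are illustrative assumptions, while the concatenated image-mask input and the multi-scale architecture follow the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PairDiscriminator(nn.Module):
    """Scores a concatenated (image, one-hot mask) pair; widths are illustrative."""
    def __init__(self, num_classes=2, c=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + num_classes, c, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(c, 2 * c, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(2 * c, 1, 4, padding=1))               # patch-wise realism map
    def forward(self, u, v_onehot):
        return self.net(torch.cat([u, v_onehot], dim=1))     # serial (concatenated) input

class MultiScalePairDiscriminator(nn.Module):
    """Applies the pair discriminator at several image scales, as described in step 5."""
    def __init__(self, num_scales=3, num_classes=2):
        super().__init__()
        self.discs = nn.ModuleList([PairDiscriminator(num_classes) for _ in range(num_scales)])
    def forward(self, u, v_onehot):
        outputs = []
        for d in self.discs:
            outputs.append(d(u, v_onehot))
            u = F.avg_pool2d(u, 2)                           # move to the next coarser scale
            v_onehot = F.avg_pool2d(v_onehot, 2)
        return outputs
```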
Step 6: set image semantic manifold converter
Figure BDA00035731835200000626
Direct mapping of an image u of an irregular material in a production scene to
Figure BDA00035731835200000627
Space (therein)
Figure BDA00035731835200000628
The space is
Figure BDA00035731835200000629
Spatial enhancement space). Image semantic manifold converter
Figure BDA00035731835200000630
Extracting multi-level features using a feature pyramid network as a backboneMapping these features to full convolutional network based
Figure BDA00035731835200000631
A space. Training image semantic manifold converter
Figure BDA00035731835200000632
The pixel-level rendering module must be fixed
Figure BDA00035731835200000633
Model parameters of, i.e. this phase
Figure BDA00035731835200000634
And does not participate in training. Image semantic manifold converter
Figure BDA00035731835200000635
Training an objective function
Figure BDA00035731835200000636
Is in the form of
Figure BDA00035731835200000637
Wherein
Figure BDA0003573183520000071
Expressing KL divergence, and summing the KL divergence loss of all pixels of the irregular material image pixel by pixel, wherein the first term represents the supervision loss. Wherein the mathematical definition of KL divergence for discrete events
Figure BDA0003573183520000072
Figure BDA0003573183520000073
Wherein A and B represent two random variables respectively, and the corresponding probability distribution is PA(xi) And PB(xi) And i represents a discrete random variable number. The sum of the last two terms of the objective function represents the unsupervised lossWherein alpha is the value range [0,1 ] under the condition of irregular material image semantic distribution]Is determined by the parameter (c) of (c),
Figure BDA0003573183520000074
an irregular material image rendering backbone representing a pixel-level rendering module,
Figure BDA0003573183520000075
and representing a semantic segmentation quasi-annotation mask label rendering branch of the irregular material image of the pixel level rendering module.
Figure BDA0003573183520000076
Is a learning perception image block similarity distance used for measuring a characteristic space of a depth residual error network (ResNet) of a pre-training model of an image network (ImageNet)
Figure BDA0003573183520000077
The distance between the first and second electrodes,
Figure BDA0003573183520000078
representing the Mahalanobis distance (Mahalanobis) between the data points x, y, where
Figure BDA0003573183520000079
Where Σ is the covariance matrix of the multidimensional random variable, T denotes the transposition. Under the guidance of the above objective function training, the image semantic manifold converter
Figure BDA00035731835200000710
Mapping image u to embedded random variables
Figure BDA00035731835200000711
The input image can be rendered again, and the irregular material image is marked, so that the semantic segmentation quasi-annotation mask label of the irregular material image is correspondingly rendered.
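The sketch below shows one way the converter training of this step could be organised, reusing the renderer and encoder interfaces assumed in the earlier sketches (the encoder returns one embedding vector per feature layer, the renderer returns an image and mask logits). The exact objective is given in the patent only as a formula image, so the α-weighted combination of a supervised pixel-wise KL term with unsupervised perceptual and Mahalanobis terms is an illustrative assumption; perceptual_dist and mahalanobis stand for the distance functions described above.

```python
import torch
import torch.nn.functional as F

def converter_loss(encoder, renderer, u, v_onehot=None, alpha=0.5,
                   perceptual_dist=None, mahalanobis=None):
    """Assumed form of the manifold-converter objective: supervised pixel-wise KL plus
    alpha-weighted unsupervised reconstruction terms; the renderer stays frozen."""
    for p in renderer.parameters():
        p.requires_grad_(False)              # the pixel-level rendering module is fixed here
    rs = encoder(u)                          # embed the image into the enhanced space
    u_hat, v_logits = renderer(rs)           # re-render the image and the quasi-annotation mask
    loss = torch.tensor(0.0, device=u.device)
    if v_onehot is not None:                 # supervised: pixel-by-pixel KL divergence
        log_q = F.log_softmax(v_logits, dim=1)
        loss = loss + F.kl_div(log_q, v_onehot, reduction="batchmean")
    # unsupervised: learned perceptual patch similarity + per-pixel Mahalanobis distance
    loss = loss + alpha * perceptual_dist(u_hat, u) + (1 - alpha) * mahalanobis(u_hat, u)
    return loss
```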
Step 7: through the image semantic manifold converter, an enhanced version of the embedding space is realised. During algorithmic inference the embedding of a new image, i.e. its vector representation, is inferred, and reasoning in the base space differs from reasoning in the enhanced space: operating in the base space, a single random noise vector r is modeled, whereas when r is modeled independently for each of the feature layers 1 to 5, the resulting space is named the enhanced space. In the subsequent algorithmic inference, embedded reasoning for the pixel-level rendering module is performed in this enhanced space; see the next step.
Step 8: deploy the rendering-model-based image semantic segmentation algorithm and run inference in production. Given a target image of irregular material in the smart-factory production scene, the optimal pixel-level label is derived as follows. The target image is first embedded into the embedding space of the pixel-level rendering module: the image semantic manifold converter maps the target image into that space, and an inversion estimate of the enhanced-space vector is obtained. In the inversion objective, the first term optimizes the reconstruction quality of the given irregular-material target image, while the second term regularizes the embedding-vector trajectory of the target image so that it stays within the training domain; the image semantic manifold converter serves to approximately invert the pixel-level rendering module. The reconstruction term is specified in the next step.
Step 9: the inferred irregular-material target image and its image semantic segmentation quasi-annotation mask label are based on a reconstruction term in which the learned perceptual image-patch similarity and a per-pixel Mahalanobis distance term of the irregular-material target image act jointly; δ is an independent hyper-parameter with value range (0, 1). The enhanced-space vector obtained by inversion is passed back to the pixel-level rendering module, whose two outputs are the inferred irregular-material target image and the inferred image semantic segmentation quasi-annotation mask label; the learned perceptual image-patch similarity distance is used as the reconstruction metric.
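The reconstruction term of this step can be written compactly as below; the δ-weighted sum and the use of an RGB-channel covariance for the per-pixel Mahalanobis term are illustrative assumptions, with perceptual_dist standing for the learned perceptual image-patch similarity distance.

```python
import torch

def reconstruction_distance(u_hat, u_target, perceptual_dist, pixel_cov_inv, delta=0.5):
    """delta-weighted combination of perceptual patch similarity and a per-pixel
    Mahalanobis distance over the colour channels (pixel_cov_inv is Sigma^{-1})."""
    diff = (u_hat - u_target).permute(0, 2, 3, 1)          # (B, H, W, 3) colour differences
    maha = torch.einsum("bhwc,cd,bhwd->bhw", diff, pixel_cov_inv, diff).clamp(min=0).sqrt()
    return delta * perceptual_dist(u_hat, u_target) + (1 - delta) * maha.mean()

# Usage with an identity covariance (the Mahalanobis term then reduces to Euclidean distance):
# dist = reconstruction_distance(u_hat, u_target, perceptual_dist, torch.eye(3), delta=0.5)
```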
As shown in Fig. 2, in the image semantic distribution space of regular material in the factory production scene, the targets to be detected have regular geometric appearance (such as circle, ellipse, square and rectangle). From the perspective of semantic segmentation these appearance features can be abstracted into a finite family of semantic features, so the geometric appearance and geometric size follow definite statistical distributions corresponding to the statistical distribution space of those finite semantic features; for this reason the space is named the image semantic distribution space of regular material. Each type of geometric finite-family semantic feature in an image is embedded as a random variable vector h. Taking h as input, p(h) is converted into a new spatial distribution p(r) through a fully connected network, a fully convolutional network and a 1 × 1 convolutional layer, where r is the random variable vector embedding each type of geometric finite-family semantic feature in the irregular-material image; see Fig. 3, which shows the image semantic distribution space of irregular material in the smart-factory production scene with random variable r. The criterion for the conversion is that the spatial distribution p(r) better fits or approximates the statistical distribution of the finite-family semantic features of irregular-material images. After this transformation the newly adjusted noise variables r are fed to the feature layers of the pixel-level rendering module (see the feature-layer schematic in Fig. 4), which outputs an image u in the output image distribution space and a semantic segmentation quasi-annotation mask label v in the quasi-annotation mask label distribution space of irregular materials.
The inversion estimate is computed by a stochastic gradient descent method. One term of the objective further optimizes the reconstruction quality of the given irregular-material input image, while the other term constrains the inverted vector r so that the hyperspace occupied by the image semantic manifold converter stays within the training-domain space; the image semantic manifold converter is trained to approximate the inverse of the pixel-level rendering module. With regularization constrained by the hyper-parameter β, the quality of the rendered irregular-material image semantic segmentation quasi-annotation mask label is effectively ensured even when inferring irregular-material images outside the training-domain space (the unlabeled data used by semi-supervised learning).
During deployment and inference in production, the semantic segmentation quasi-annotation mask label of the irregular-material image is optimized through the image semantic manifold converter network together with test-time optimization; the target irregular-material image is embedded into the joint latent semantic space, and finally the semantic segmentation quasi-annotation mask label is rendered from the inferred embedded random vector.
The inferred irregular-material target image and its semantic segmentation quasi-annotation mask label are based on the reconstruction term, which combines the learned perceptual image-patch similarity with the per-pixel Mahalanobis distance term of the irregular-material target image; δ is an independent hyper-parameter with value range (0, 1). The enhanced-space vector obtained by inversion is passed back to the pixel-level rendering module. Because that vector is optimized to minimize the reconstruction error between the target image and the rendered image, the rendered image is approximately equal to the target image. In addition, since the pixel-level rendering module is trained so that the synthesized segmentation labels and images are aligned, the rendered mask is the best label for the reconstructed image. Therefore the irregular-material image semantic segmentation quasi-annotation mask label finally output by the model is, with high probability, equal to the iteratively optimal image semantic segmentation of the irregular-material target image. In this way the invention realizes transfer learning with a semi-supervised method by combining a large number of unlabeled irregular-material sample images with a small number of labeled samples, and successfully solves the semantic segmentation and edge detection problem of irregular materials under the small-sample regime of labeled data.
The foregoing is only a partial embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and such modifications and refinements shall also fall within the protection scope of the present invention.

Claims (10)

1. An image semantic segmentation method for irregular materials is characterized by comprising the following steps:
Step S1: constructing a training data set comprising two sub data sets, one an unlabeled irregular-material image set and the other a labeled irregular-material image set;
Step S2: constructing and training a pixel-level rendering module which takes a random variable in vector form as input and outputs a semantic segmentation quasi-annotation mask together with an image; the input random variable is generated by converting, through a deep network, the random-variable probability distribution of the regular-material image semantic space into a new spatial distribution, the conversion criterion being that the new distribution better conforms to the image semantic distribution of irregular materials in the production scene;
Step S3: constructing an irregular-material rendered-image discriminator applied to real irregular-material images and rendered irregular-material images in the production scene, and an irregular-material rendered image-label pair discriminator for distinguishing rendered irregular-material image-label pairs from real production-scene irregular-material image-label pairs;
Step S4: setting up an image semantic manifold converter that maps production-scene irregular-material images to embedded random variables, yielding an embedded enhanced space; combined with the labeled irregular-material images, the corresponding semantic segmentation quasi-annotation mask labels of the irregular-material images can then be rendered; during inference, the embedding of a new image, i.e. its vector representation, is computed first;
Step S5: given a target image of irregular material, the target image is embedded into the embedded enhanced space of the pixel-level rendering module; the image semantic manifold converter maps the target image into this space, an inversion estimate of the enhanced-space vector is computed, and the optimal pixel-level label is obtained by inference;
Step S6: combining the learned perceptual image-patch similarity with a per-pixel Mahalanobis distance term of the irregular-material target image, the inversion estimate of the enhanced-space vector is passed back to the pixel-level rendering module to obtain the image semantic segmentation result for the irregular material, namely the target image and its quasi-annotation mask label.
2. The image semantic segmentation method for irregular materials according to claim 1, wherein in step S1 a training data set is constructed comprising two sub data sets T_a and T_b, wherein the data set T_a is an unlabeled irregular-material image set, T_a = {u_1, ..., u_J}, where u_j represents each irregular-material image, j = 1, 2, ..., J; the data set T_b is a data-cleaned irregular-material image set in which every image carries a semantic segmentation annotation, T_b = {(u_1, v_1), ..., (u_L, v_L)}, where u_l represents each irregular-material image and v_l represents the semantic segmentation quasi-annotation mask label corresponding to image u_l, l = 1, 2, ..., L, with J >> L.
3. The method according to claim 1, wherein the pixel-level rendering module in step S2 has a residual jump connection design, and has 5 concatenated feature layers in sequence from the input end, and an additional branch is added to each feature layer to output the semantic segmentation quasi-label mask v and the image output u.
4. The method according to claim 3, wherein training takes a random noise space as input, the random variable of this space being denoted h, with h obeying a standard normal distribution; p(h) is first converted into a new spatial distribution p(r) through a fully connected network, a fully convolutional network and a 1 × 1 convolutional layer, the conversion criterion being that the spatial distribution p(r) better conforms to or approximates the image semantic distribution of irregular materials in the smart-factory production scene; after an affine transformation, the noise variables r with this newly adjusted distribution are fed to the feature layers of the pixel-level rendering module, which outputs an image u in the output image distribution space and a semantic segmentation quasi-annotation mask label v in the quasi-annotation mask label distribution space of irregular materials.
5. The image semantic segmentation method for irregular materials according to claim 1, wherein the objective function used in training the pixel-level rendering module involves the sigmoid function sigmoid(t) = S(t) = 1/(1 + e^(-t)), and the training objective is to minimize this objective function by back-propagation gradient descent.
6. The image semantic segmentation method for irregular materials according to claim 1, wherein in step S3 an irregular-material rendered-image discriminator is constructed which maps an image to a real-valued score, is applied to real irregular-material images and rendered irregular-material images in the production scene, drives the pixel-level rendering module to construct realistic images, and is implemented on a residual structure; the training objective is to maximize the discriminator's objective function by the back-propagation gradient descent method.
7. The image semantic segmentation method for irregular materials according to claim 1, wherein in step S3 an irregular-material rendered image-label pair discriminator is constructed which concatenates the input image with the semantic segmentation quasi-annotation mask label to distinguish rendered irregular-material image-label pairs from real production-scene irregular-material image-label pairs, and adopts a multi-scale discriminator architecture; the training objective is to maximize this discriminator's objective function by the back-propagation gradient descent method.
8. The image semantic segmentation method for irregular materials according to claim 1, wherein in step S4 the image semantic manifold converter extracts multi-level features using a feature pyramid network as backbone and maps these features, through a fully convolutional network, into the embedded enhanced space; when training the image semantic manifold converter the model parameters of the pixel-level rendering module must be fixed, i.e. the rendering module does not participate in training at this stage; the converter's training objective sums a pixel-by-pixel KL divergence loss over all pixels of the irregular-material image, this first term representing the supervised loss, where for discrete events the KL divergence is defined as D_KL(A || B) = Σ_i P_A(x_i) log(P_A(x_i) / P_B(x_i)), A and B denoting two random variables with corresponding probability distributions P_A(x_i) and P_B(x_i) and i indexing the discrete random variable; the sum of the last two terms of the objective function represents the unsupervised loss, where α is a hyper-parameter with value range [0, 1] under the irregular-material image semantic distribution condition, one term involving the irregular-material image rendering backbone of the pixel-level rendering module and the other the branch that renders the irregular-material image semantic segmentation quasi-annotation mask label; the learned perceptual image-patch similarity distance measures distance in the feature space of a deep residual network pre-trained on ImageNet, and the Mahalanobis distance between data points x and y is d_M(x, y) = sqrt((x - y)^T Σ^(-1) (x - y)), where Σ is the covariance matrix of the multidimensional random variable and T denotes transposition.
9. The image semantic segmentation method for irregular materials according to claim 1, wherein in step S5, given a target image of irregular material in the smart-factory production scene, the optimal pixel-level label is derived as follows: the target image is first embedded into the embedding space of the pixel-level rendering module, the image semantic manifold converter maps the target image into that space, and an inversion estimate of the enhanced-space vector is obtained; in the inversion objective the first term optimizes the reconstruction quality of the given irregular-material target image and the second term regularizes the embedding-vector trajectory of the target image so that it stays within the training domain, the image semantic manifold converter being used to approximately invert the pixel-level rendering module.
10. The image semantic segmentation method for irregular materials according to claim 9, wherein the inferred irregular-material target image and its image semantic segmentation quasi-annotation mask label are based on a reconstruction term in which the learned perceptual image-patch similarity and a per-pixel Mahalanobis distance term of the irregular-material target image act jointly, δ being an independent hyper-parameter with value range (0, 1); the enhanced-space vector obtained by inversion is passed back to the pixel-level rendering module, whose outputs are the inferred irregular-material target image and the inferred image semantic segmentation quasi-annotation mask label, with the learned perceptual image-patch similarity distance used as the reconstruction metric.
CN202210331353.4A 2022-03-30 2022-03-30 Image semantic segmentation method for irregular material of transfer learning Withdrawn CN114627297A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210331353.4A CN114627297A (en) 2022-03-30 2022-03-30 Image semantic segmentation method for irregular material of transfer learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210331353.4A CN114627297A (en) 2022-03-30 2022-03-30 Image semantic segmentation method for irregular material of transfer learning

Publications (1)

Publication Number Publication Date
CN114627297A true CN114627297A (en) 2022-06-14

Family

ID=81903719

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210331353.4A Withdrawn CN114627297A (en) 2022-03-30 2022-03-30 Image semantic segmentation method for irregular material of transfer learning

Country Status (1)

Country Link
CN (1) CN114627297A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738310A (en) * 2020-06-04 2020-10-02 科大讯飞股份有限公司 Material classification method and device, electronic equipment and storage medium
CN114067168A (en) * 2021-10-14 2022-02-18 河南大学 Cloth defect image generation system and method based on improved variational self-encoder network

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738310A (en) * 2020-06-04 2020-10-02 科大讯飞股份有限公司 Material classification method and device, electronic equipment and storage medium
CN114067168A (en) * 2021-10-14 2022-02-18 河南大学 Cloth defect image generation system and method based on improved variational self-encoder network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Daiqing Li et al.: "Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization", 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 8296-8307 *

Similar Documents

Publication Publication Date Title
CN112465111A (en) Three-dimensional voxel image segmentation method based on knowledge distillation and countertraining
CN110363068B (en) High-resolution pedestrian image generation method based on multiscale circulation generation type countermeasure network
CN112967178B (en) Image conversion method, device, equipment and storage medium
CN114820341A (en) Image blind denoising method and system based on enhanced transform
Liu et al. RB-Net: Training highly accurate and efficient binary neural networks with reshaped point-wise convolution and balanced activation
CN113793261A (en) Spectrum reconstruction method based on 3D attention mechanism full-channel fusion network
CN115829876A (en) Real degraded image blind restoration method based on cross attention mechanism
CN113066074A (en) Visual saliency prediction method based on binocular parallax offset fusion
Yang et al. A survey of super-resolution based on deep learning
CN109087247A (en) The method that a kind of pair of stereo-picture carries out oversubscription
CN116823647A (en) Image complement method based on fast Fourier transform and selective attention mechanism
Xu et al. Transformer image recognition system based on deep learning
CN108596831B (en) Super-resolution reconstruction method based on AdaBoost example regression
Wu et al. Lightweight stepless super-resolution of remote sensing images via saliency-aware dynamic routing strategy
CN114627297A (en) Image semantic segmentation method for irregular material of transfer learning
CN112613405B (en) Method for recognizing actions at any visual angle
CN115035377A (en) Significance detection network system based on double-stream coding and interactive decoding
Zhang et al. Image super-resolution via RL-CSC: when residual learning meets convolutional sparse coding
CN114331894A (en) Face image restoration method based on potential feature reconstruction and mask perception
Tian et al. Depth inference with convolutional neural network
Nie et al. Image restoration from patch-based compressed sensing measurement
Tan et al. DBSwin: Transformer based dual branch network for single image deraining
Liao et al. Cross-Attention and Cycle-Consistency-Based Haptic To Image Inpainting
Chen et al. Learning Spatiotemporal Features for Video Semantic Segmentation Using 3D Convolutional Neural Networks
CN118470048B (en) Real-time feedback interactive tree image matting method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication
WW01 Invention patent application withdrawn after publication

Application publication date: 20220614